Prostate-specific gene, PCGEM1, and methods of using PCGEM1 to detect, treat, and prevent prostate cancer

ABSTRACT

A nucleic acid sequence that exhibits prostate-specific expression and over-expression in tumor cells is disclosed. The sequence and fragments thereof are useful for detecting, diagnosing, preventing, and treating prostate cancer and other prostate related diseases. The sequence is also useful for measuring hormone responsiveness of prostate cancer cells.

CROSS REFERENCE TO RELATED APPLICATIONS

[0001] The present application is based upon U.S. provisional application S. No. 60/126,469, filed Mar. 26, 1999, priority to which is claimed under 35 U.S.C. § 119(e). The entire disclosure of U.S. provisional application S. No. 60/126,469 is expressly incorporated herein by reference.

GOVERNMENT INTEREST

[0002] The invention described herein may be manufactured, licensed, and used for governmental purposes without payment of royalties to us thereon.

FIELD OF THE INVENTION

[0003] The present invention relates to nucleic acids that are expressed in prostate tissue. More particularly, the present invention relates to the first of a family of novel, androgen-regulated, prostate-specific genes, PCGEM1, that is over-expressed in prostate cancer, and methods of using the PCGEM1 sequence and fragments thereof to measure the hormone responsiveness of prostate cancer cells and to detect, diagnose, prevent, and treat prostate cancer and other prostate related diseases.

BACKGROUND

[0004] Prostate cancer is the most common solid tumor in American men (1). The wide spectrum of biologic behavior (2) exhibited by prostatic neoplasms poses a difficult problem in predicting the clinical course for the individual patient (3, 4). Public awareness of prostate specific antigen (PSA) screening efforts has led to an increased diagnosis of prostate cancer. The increased diagnosis and greater number of patients presenting with prostate cancer have resulted in wider use of radical prostatectomy for localized disease (5). Accompanying the rise in surgical intervention is the frustrating realization of the inability to predict organ-confined disease and clinical outcome for a given patient (5, 6). Traditional prognostic markers, such as grade, clinical stage, and pretreatment PSA, have limited prognostic value for individual men. There is clearly a need to recognize and develop molecular and genetic biomarkers to improve prognostication and the management of patients with clinically localized prostate cancer. As with other common human neoplasia (7), the search for molecular and genetic biomarkers to better define the genesis and progression of prostate cancer is the key focus for cancer research investigations worldwide.

[0005] The new wave of research addressing molecular genetic alterations in prostate cancer is primarily due to increased awareness of this disease and the development of newer molecular technologies. The search for the precursor of prostatic adenocarcinoma has focused largely on the spectrum of microscopic changes referred to as “prostatic intraepithelial neoplasia” (PIN). Bostwick defines this spectrum as a histopathologic continuum that culminates in high grade PIN and early invasive cancer (8). The morphologic and molecular changes include the progressive disruption of the basal cell-layer, changes in the expression of differentiation markers of the prostatic secretory epithelial cells, nuclear and nucleolar abnormalities, increased cell proliferation, DNA content alterations, and chromosomal and allelic losses (8, 9). These molecular and genetic biomarkers, particularly their progressive gain or loss, can be followed to trace the etiology of prostate carcinogenesis. Foremost among these biomarkers would be the molecular and genetic markers associated with histological phenotypes in transition between normal prostatic epithelium and cancer. Most studies so far seem to agree that PIN and prostatic adenocarcinoma cells have a lot in common with each other. The invasive carcinoma more often reflects a magnification of some of the events already manifest in PIN.

[0006] Early detection of prostate cancer is possible today because of the widely propagated and recommended blood PSA test that provides a warning signal for prostate cancer if high levels of serum PSA are detected. However, when used alone, PSA is not sufficiently sensitive or specific to be considered an ideal tool for the early detection or staging of prostate cancer (10). Combining PSA levels with clinical staging and Gleason scores is more predictive of the pathological stage of localized prostate cancer (11). In addition, new molecular techniques are being used for improved molecular staging of prostate cancer (12, 13). For instance, reverse transcriptase-polymerase chain reaction (RT-PCR) can measure PSA of circulating prostate cells in blood and bone marrow of prostate cancer patients.

[0007] Despite new molecular techniques, however, as many as 25 percent of men with prostate cancer will have normal PSA levels, usually defined as those equal to or below 4 nanograms per milliliter of blood (14). In addition, more than 50 percent of the men with higher PSA levels are actually cancer free (14). Thus, PSA is not an ideal screening tool for prostate cancer. More reliable tumor-specific biomarkers are needed that can distinguish between normal and hyperplastic epithelium, and the preneoplastic and neoplastic stages of prostate cancer.

[0008] Identification and characterization of genetic alterations defining prostate cancer onset and progression is important in understanding the biology and clinical course of the disease. The currently available TNM staging system assigns the original primary tumor (T) to one of four stages (14). The first stage, T1, indicates that the tumor is microscopic and cannot be felt on rectal examination. T2 refers to tumors that are palpable but fully contained within the prostate gland. A T3 designation indicates the cancer has spread beyond the prostate into surrounding connective tissue or has invaded the neighboring seminal vesicles. T4 cancer has spread even further. The TNM staging system also assesses whether the cancer has metastasized to the pelvic lymph nodes (N) or beyond (M). Metastatic tumors result when cancer cells break away from the original tumor, circulate through the blood or lymph, and proliferate at distant sites in the body.

[0009] Recent studies of metastatic prostate cancer have shown a significant heterogeneity of allelic losses of different chromosome regions between multiple cancer foci (21-23). These studies have also documented that the metastatic lesion can arise from cancer foci other than dominant tumors (22). Therefore, it is critical to understand the molecular changes which define prostate cancer metastasis, especially when prostate cancer is increasingly detected in early stages (15-21).

[0010] Moreover, the multifocal nature of prostate cancer needs to be considered (22-23) when analyzing biomarkers that may have potential to predict tumor progression or metastasis. Approximately 50-60% of patients treated with radical prostatectomy for localized prostate carcinomas are found to have microscopic disease that is not organ confined, and a significant portion of these patients relapse (24). Utilizing biostatistical modeling of traditional and genetic biomarkers such as p53 and bcl-2, Bauer et al. (25-26) were able to identify patients at risk of cancer recurrence after surgery. Thus, there is clearly a need to develop biomarkers defining various stages of prostate cancer progression.

[0011] Another significant aspect of prostate cancer is the key role that androgens play in the development of both the normal prostate and prostate cancer. Androgen ablation, also referred to as “hormonal therapy,” is a common treatment for prostate cancer, particularly in patients with metastatic disease (14). Hormonal therapy aims to inhibit the body from making androgens or to block the activity of androgen. One way to block androgen activity involves blocking the androgen receptor; however, that blockage is often only successful initially. For example, 70-80% of patients with advanced disease exhibit an initial subjective response to hormonal therapy, but most tumors progress to an androgen-independent state within two years (16). One mechanism proposed for the progression to an androgen-independent state involves constitutive activation of the androgen signaling pathway, which could arise from structural changes in the androgen receptor protein (16).

[0012] As indicated above, the genesis and progression of cancer cells involve multiple genetic alterations as well as a complex interaction of several gene products. Thus, various strategies are required to fully understand the molecular genetic alterations in a specific type of cancer. In the past, most molecular biology studies had focused on mutations of cellular proto-oncogenes and tumor suppressor genes (TSGs) associated with prostate cancer (7). Recently, however, there has been an increasing shift toward the analysis of “expression genetics” in human cancer (27-31), i.e., the under-expression or over-expression of cancer-specific genes. This shift addresses limitations of the previous approaches including: 1) labor intensive technology involved in identifying mutated genes that are associated with human cancer; 2) the limitations of experimental models with a bias toward identification of only certain classes of genes, e.g., identification of mutant ras genes by transfection of human tumor DNAs utilizing NIH3T3 cells; and 3) the recognition that the human cancer associated genes identified so far do not account for the diversity of cancer phenotypes.

[0013] A number of studies are now addressing the alterations of prostate cancer-associated gene expression in patient specimens (32-36). More reports along these lines are expected to follow.

[0014] Thus, despite the growing body of knowledge regarding prostate cancer, there is still a need in the art to uncover the identity and function of the genes involved in prostate cancer pathogenesis. There is also a need for reagents and assays to accurately detect cancerous cells, to define various stages of prostate cancer progression, to identify and characterize genetic alterations defining prostate cancer onset and progression, to detect micro-metastasis of prostate cancer, and to treat and prevent prostate cancer.

SUMMARY OF THE INVENTION

[0015] The present invention relates to the identification and characterization of a novel gene, the first of a family of genes, designated PCGEM1, for Prostate Cancer Gene Expression Marker 1. PCGEM1 is specific to prostate tissue, is androgen-regulated, and appears to be over-expressed in prostate cancer. More recent studies associate PCGEM1 cDNA with promoting cell growth. The invention provides the isolated nucleotide sequence of PCGEM1 or fragments thereof and nucleic acid sequences that hybridize to PCGEM1. These sequences have utility, for example, as markers of prostate cancer and other prostate related diseases, and as targets for therapeutic intervention in prostate cancer and other prostate related diseases. The invention further provides a vector that directs the expression of PCGEM1, and a host cell transfected or transduced with this vector.

[0016] In another embodiment, the invention provides a method of detecting prostate cancer cells in a biological sample, for example, by using nucleic acid amplification techniques with primers and probes selected to bind specifically to the PCGEM1 sequence. The invention further comprises a method of selectively killing a prostate cancer cell, a method of identifying an androgen responsive cell line, and a method of measuring responsiveness of a cell line to hormone-ablation therapy.

[0017] In another aspect, the invention relates to an isolated polypeptide encoded by the PCGEM1 gene or a fragment thereof, and antibodies generated against the PCGEM1 polypeptide, peptides, or portions thereof, which can be used to detect, treat, and prevent prostate cancer.

[0018] Additional features and advantages of the invention will be set forth in the description which follows, and in part will be apparent from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the sequences, cells, vectors, and methods particularly pointed out in the written description and claims herein as well as the appended drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

[0019] FIG. 1 depicts the scheme for the identification of differentially expressed genes in prostate tumor and normal tissues.

[0020] FIG. 2 depicts a differential display pattern of mRNA obtained from matched tumor and normal tissues of a prostate cancer patient. Arrows indicate differentially expressed cDNAs.

[0021] FIG. 3 depicts the analysis of PCGEM1 expression in primary prostate cancers.

[0022] FIG. 4 depicts the expression pattern of PCGEM1 in prostate cancer cell lines.

[0023] FIG. 5a depicts the androgen regulation of PCGEM1 expression in LNCaP cells, as measured by reverse transcriptase PCR.

[0024] FIG. 5b depicts the androgen regulation of PCGEM1 expression in LNCaP cells, as measured by Northern blot hybridization.

[0025] FIG. 6a depicts the prostate tissue specific expression pattern of PCGEM1.

[0026] FIG. 6b depicts an RNA master blot showing the prostate tissue specificity of PCGEM1.

[0027] FIG. 7A depicts the chromosomal localization of PCGEM1 by fluorescent in situ hybridization analysis.

[0028] FIG. 7B depicts a DAPI counter-stained chromosome 2 (left), an inverted DAPI stained chromosome 2 shown as G-bands (center), and an ideogram of chromosome 2 showing the localization of the signal to band 2q32 (bar).

[0029] FIG. 8 depicts a cDNA sequence of PCGEM1 (SEQ ID NO:1).

[0030] FIG. 9 depicts an additional cDNA sequence of PCGEM1 (SEQ ID NO:2).

[0031] FIG. 10 depicts the colony formation of NIH3T3 cell lines expressing various PCGEM1 constructs.

[0032] FIG. 11 depicts the cDNA sequence of the promoter region of PCGEM1 (SEQ ID NO:3).

[0033] FIG. 12 depicts the cDNA of a probe, designated SEQ ID NO:4.

[0034] FIG. 13 depicts the cDNAs of primers 1-3, designated SEQ ID NOs:5-7, respectively.

[0035] FIG. 14 depicts the genomic DNA sequence of PCGEM1, designated SEQ ID NO:8.

[0036] FIG. 15 depicts the structure of the PCGEM1 transcription unit.

[0037] FIG. 16 depicts a graph of the hypothetical coding capacity of PCGEM1.

[0038] FIG. 17 depicts a representative example of in situ hybridization results showing PCGEM1 expression in normal and tumor areas of prostate cancer tissues.

DETAILED DESCRIPTION OF THE INVENTION

[0039] The present invention relates to PCGEM1, the first of a family of genes, and its related nucleic acids, proteins, antigens, and antibodies for use in the detection, prevention, and treatment of prostate cancer (e.g., prostatic intraepithelial neoplasia (PIN), adenocarcinomas, nodular hyperplasia, and large duct carcinomas) and prostate related diseases (e.g., benign prostatic hyperplasia), and kits comprising these reagents.

[0040] Although we do not wish to be limited by any theory or hypothesis, preliminary data suggest that the PCGEM1 nucleotide sequence may be related to a family of non-coding poly A+ RNA that may be implicated in processes relating to growth and embryonic development (40-44). Evidence presented herein supports this hypothesis. Alternatively, PCGEM1 cDNA may encode a small peptide.

[0041] Nucleic Acid Molecules

[0042] In a particular embodiment, the invention relates to certain isolated nucleotide sequences that are substantially free from contaminating endogenous material. A “nucleotide sequence” refers to a polynucleotide molecule in the form of a separate fragment or as a component of a larger nucleic acid construct. The nucleic acid molecule has been derived from DNA or RNA isolated at least once in substantially pure form and in a quantity or concentration enabling identification, manipulation, and recovery of its component nucleotide sequences by standard biochemical methods (such as those outlined in Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1989)).

[0043] Nucleic acid molecules of the invention include DNA in both single-stranded and double-stranded form, as well as the RNA complement thereof. DNA includes, for example, cDNA, genomic DNA, chemically synthesized DNA, DNA amplified by PCR, and combinations thereof. Genomic DNA may be isolated by conventional techniques, e.g., using the cDNA of SEQ ID NO:1, SEQ ID NO:2, or suitable fragments thereof, as a probe.

[0044] The DNA molecules of the invention include full length genes as well as polynucleotides and fragments thereof. The full length gene may include the N-terminal signal peptide. Although a non-coding role of PCGEM1 appears likely, the possibility of a protein product cannot presently be ruled out. Therefore, other embodiments may include DNA encoding a soluble form, e.g., encoding the extracellular domain of the protein, either with or without the signal peptide.

[0045] The nucleic acids of the invention are preferentially derived from human sources, but the invention includes those derived from non-human species, as well.

[0046] Preferred Sequences

[0047] Particularly preferred nucleotide sequences of the invention are SEQ ID NO:1, SEQ ID NO:2, and SEQ ID NO:8, as set forth in FIGS. 8, 9, and 14, respectively. Two cDNA clones having the nucleotide sequences of SEQ ID NO:1 and SEQ ID NO:2, and the genomic DNA having the nucleotide sequence of SEQ ID NO:8, were isolated as described in Example 2.

[0048] Thus, in a particular embodiment, this invention provides an isolated nucleic acid molecule selected from the group consisting of (a) the polynucleotide sequence of SEQ ID NO:1, SEQ ID NO:2, or SEQ ID NO:8; (b) an isolated nucleic acid molecule that hybridizes to either strand of a denatured, double-stranded DNA comprising the nucleic acid sequence of (a) under conditions of moderate stringency in 50% formamide and about 6×SSC at about 42° C. with washing conditions of approximately 60° C., about 0.5×SSC, and about 0.1% SDS; (c) an isolated nucleic acid molecule that hybridizes to either strand of a denatured, double-stranded DNA comprising the nucleic acid sequence of (a) under conditions of high stringency in 50% formamide and about 6×SSC, with washing conditions of approximately 68° C., about 0.2×SSC, and about 0.1% SDS; (d) an isolated nucleic acid molecule derived by in vitro mutagenesis from SEQ ID NO:1, SEQ ID NO:2, or SEQ ID NO:8; (e) an isolated nucleic acid molecule degenerate from SEQ ID NO:1, SEQ ID NO:2, or SEQ ID NO:8 as a result of the genetic code; and (f) an isolated nucleic acid molecule selected from the group consisting of human PCGEM1 DNA, an allelic variant of human PCGEM1 DNA, and a species homolog of PCGEM1 DNA.

[0049] As used herein, conditions of moderate stringency can be readily determined by those having ordinary skill in the art based on, for example, the length of the DNA. The basic conditions are set forth by Sambrook et al. Molecular Cloning: A Laboratory Manual, 2d ed. Vol. 1, pp. 1.101-104, Cold Spring Harbor Laboratory Press, (1989), and include use of a prewashing solution for the nitrocellulose filters of about 5×SSC, about 0.5% SDS, and about 1.0 mM EDTA (pH 8.0), hybridization conditions of about 50% formamide, about 6×SSC at about 42° C. (or other similar hybridization solution, such as Stark's solution, in about 50% formamide at about 42° C.), and washing conditions of about 60° C., about 0.5×SSC, and about 0.1% SDS. Conditions of high stringency can also be readily determined by the skilled artisan based on, for example, the length of the DNA. Generally, such conditions are defined as hybridization conditions as above, and with washing at approximately 68° C., about 0.2×SSC, and about 0.1% SDS. The skilled artisan will recognize that the temperature and wash solution salt concentration can be adjusted as necessary according to factors such as the length of the probe.

[0050] Additional Sequences

[0051] Due to the known degeneracy of the genetic code, wherein more than one codon can encode the same amino acid, a DNA sequence can vary from that shown in SEQ ID NO:1, SEQ ID NO:2, or SEQ ID NO:8, and still encode PCGEM1. Such variant DNA sequences can result from silent mutations (e.g., occurring during PCR amplification), or can be the product of deliberate mutagenesis of a native sequence.

[0052] The invention thus provides isolated DNA sequences of the invention selected from: (a) DNA comprising the nucleotide sequence of SEQ ID NO:1, SEQ ID NO:2, or SEQ ID NO:8; (b) DNA capable of hybridization to a DNA of (a) under conditions of moderate stringency; (c) DNA capable of hybridization to a DNA of (a) under conditions of high stringency; and (d) DNA which is degenerate as a result of the genetic code to a DNA defined in (a), (b), or (c). Such sequences are preferably provided and/or constructed in the form of an open reading frame uninterrupted by internal non-translated sequences, or introns, that are typically present in eukaryotic genes. Sequences of non-translated DNA can be present 5′ or 3′ from an open reading frame, where the same do not interfere with manipulation or expression of the coding region. Of course, should PCGEM1 encode a polypeptide, polypeptides encoded by such DNA sequences are encompassed by the invention. Conditions of moderate and high stringency are described above.

[0053] In another embodiment, the nucleic acid molecules of the invention comprise nucleotide sequences that are at least 80% identical to a nucleotide sequence set forth herein. Also contemplated are embodiments in which a nucleic acid molecule comprises a sequence that is at least 90% identical, at least 95% identical, at least 98% identical, at least 99% identical, or at least 99.9% identical to a nucleotide sequence set forth herein.

[0054] Percent identity may be determined by visual inspection and mathematical calculation. Alternatively, percent identity of two nucleic acid sequences may be determined by comparing sequence information using the GAP computer program, version 6.0 described by Devereux et al. (Nucl. Acids Res. 12:387, 1984) and available from the University of Wisconsin Genetics Computer Group (UWGCG). The preferred default parameters for the GAP program include: (1) a unary comparison matrix (containing a value of 1 for identities and 0 for non-identities) for nucleotides, and the weighted comparison matrix of Gribskov and Burgess, Nucl. Acids Res. 14:6745, 1986, as described by Schwartz and Dayhoff, eds., Atlas of Protein Sequence and Structure, National Biomedical Research Foundation, pp. 353-358, 1979; (2) a penalty of 3.0 for each gap and an additional 0.10 penalty for each symbol in each gap; and (3) no penalty for end gaps. Other programs used by one skilled in the art of sequence comparison may also be used.
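
By way of illustration only, the two calculations mentioned in the preceding paragraph can be sketched in Python. This is not the GAP program. It assumes the two sequences have already been aligned, that "-" marks a gap position, and that identity is expressed relative to the shorter ungapped sequence, which is one common convention. The function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences (assumed convention:
    identical columns divided by the length of the shorter ungapped sequence)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    a, b = aligned_a.upper(), aligned_b.upper()
    # Unary comparison: 1 for an identity, 0 otherwise; gap columns never match.
    matches = sum(1 for x, y in zip(a, b) if x == y and x != gap)
    shorter = min(len(a.replace(gap, "")), len(b.replace(gap, "")))
    if shorter == 0:
        raise ValueError("each sequence must contain at least one symbol")
    return 100.0 * matches / shorter


def gap_penalty(gap_length: int, per_gap: float = 3.0, per_symbol: float = 0.10) -> float:
    """Penalty for one gap: 3.0 per gap plus 0.10 for each symbol in the gap."""
    if gap_length <= 0:
        return 0.0
    return per_gap + per_symbol * gap_length
```

For example, an aligned pair sharing 4 identical columns where the shorter sequence has 5 symbols scores 80% identity, which would meet the "at least 80% identical" embodiment but not the 90% one.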

[0055] The invention also provides isolated nucleic acids useful in the production of polypeptides. Such polypeptides may be prepared by any of a number of conventional techniques. A DNA sequence of this invention or desired fragment thereof may be subcloned into an expression vector for production of the polypeptide or fragment. The DNA sequence advantageously is fused to a sequence encoding a suitable leader or signal peptide. Alternatively, the desired fragment may be chemically synthesized using known techniques. DNA fragments also may be produced by restriction endonuclease digestion of a full length cloned DNA sequence, and isolated by electrophoresis on agarose gels. If necessary, oligonucleotides that reconstruct the 5′ or 3′ terminus to a desired point may be ligated to a DNA fragment generated by restriction enzyme digestion. Such oligonucleotides may additionally contain a restriction endonuclease cleavage site upstream of the desired coding sequence, and position an initiation codon (ATG) at the N-terminus of the coding sequence.

[0056] The well-known polymerase chain reaction (PCR) procedure also may be employed to isolate and amplify a DNA sequence encoding a desired protein fragment. Oligonucleotides that define the desired termini of the DNA fragment are employed as 5′ and 3′ primers. The oligonucleotides may additionally contain recognition sites for restriction endonucleases, to facilitate insertion of the amplified DNA fragment into an expression vector. PCR techniques are described in Saiki et al., Science 239:487 (1988); Recombinant DNA Methodology, Wu et al., eds., Academic Press, Inc., San Diego (1989), pp. 189-196; and PCR Protocols: A Guide to Methods and Applications, Innis et al., eds., Academic Press, Inc. (1990).

[0057] Use of PCGEM1 Nucleic Acid or Oligonucleotides

[0058] In a particular embodiment, the invention relates to PCGEM1 nucleotide sequences isolated from human prostate cells, including the complete genomic DNA (FIG. 14, SEQ ID NO:8), and two full length cDNAs: SEQ ID NO:1 (FIG. 8) and SEQ ID NO:2 (FIG. 9), and fragments thereof. The nucleic acids of the invention, including DNA, RNA, mRNA and oligonucleotides thereof, are useful in a variety of applications in the detection, diagnosis, prognosis, and treatment of prostate cancer. Examples of applications within the scope of the present invention include, but are not limited to:

[0059] amplifying PCGEM1 sequences;

[0060] detecting a PCGEM1-derived marker of prostate cancer by hybridization with an oligonucleotide probe;

[0061] identifying chromosome 2;

[0062] mapping genes to chromosome 2;

[0063] identifying genes associated with certain diseases, syndromes, or other conditions associated with human chromosome 2;

[0064] constructing vectors having PCGEM1 sequences;

[0065] expressing vector-associated PCGEM1 sequences as RNA and protein;

[0066] detecting defective genes in an individual;

[0067] developing gene therapy;

[0068] developing immunologic reagents corresponding to PCGEM1-encoded products; and

[0069] treating prostate cancer using antibodies, antisense nucleic acids, or other inhibitors specific for PCGEM1 sequences.

[0070] Detecting, Diagnosing, and Treating Prostate Cancer

[0071] The present invention provides a method of detecting prostate cancer in a patient, which comprises (a) detecting PCGEM1 mRNA in a biological sample from the patient; and (b) correlating the amount of PCGEM1 mRNA in the sample with the presence of prostate cancer in the patient. Detecting PCGEM1 mRNA in a biological sample may include: (a) isolating RNA from said biological sample; (b) amplifying a PCGEM1 cDNA molecule; (c) incubating the PCGEM1 cDNA with the isolated nucleic acid of the invention; and (d) detecting hybridization between the PCGEM1 cDNA and the isolated nucleic acid. The biological sample can be selected from the group consisting of blood, urine, and tissue, for example, from a biopsy. In a preferred embodiment, the biological sample is blood. This method is useful in both the initial diagnosis of prostate cancer, and the later prognosis of disease. This method allows for testing prostate tissue in a biopsy, and after removal of a cancerous prostate, continued monitoring of the blood for micrometastases.

[0072] According to this method of diagnosing and prognosticating prostate cancer in a patient, the amount of PCGEM1 mRNA in a biological sample from a patient is correlated with the presence of prostate cancer in the patient. Those of ordinary skill in the art can readily assess the level of over-expression that is correlated with the presence of prostate cancer.

[0073] In another embodiment, this invention provides a vector, comprising a PCGEM1 promoter sequence operatively linked to a nucleotide sequence encoding a cytotoxic protein. The invention further provides a method of selectively killing a prostate cancer cell, which comprises introducing the vector to prostate cancer cells under conditions sufficient to permit selective killing of the prostate cells. As used herein, the phrase “selective killing” is meant to include the killing of at least a cell which is specifically targeted by a nucleotide sequence. The putative PCGEM1 promoter, contained in the 5′ flanking region of the PCGEM1 genomic sequence, SEQ ID NO:3, is set forth in FIG. 11. Applicants envision that a nucleotide sequence encoding any cytotoxic protein can be incorporated into this vector for delivery to prostate tissue. For example, the cytotoxic protein can be ricin, abrin, diphtheria toxin, p53, thymidine kinase, tumor necrosis factor, cholera toxin, Pseudomonas aeruginosa exotoxin A, ribosomal inactivating proteins, or mycotoxins such as trichothecenes, and derivatives and fragments (e.g., single chains) thereof.

[0074] This invention also provides a method of identifying an androgen-responsive cell line, which comprises (a) obtaining a cell line suspected of being androgen-responsive; (b) incubating the cell line with an androgen; and (c) detecting PCGEM1 mRNA in the cell line, wherein an increase in PCGEM1 mRNA, as compared to an untreated cell line, correlates with the cell line being androgen-responsive.

[0075] The invention further provides a method of measuring the responsiveness of a prostatic tissue to hormone-ablation therapy, which comprises (a) treating the prostatic tissue with hormone-ablation therapy; and (b) measuring PCGEM1 mRNA in the prostatic tissue following hormone-ablation therapy, wherein a decrease in PCGEM1 mRNA, as compared to an untreated cell line, correlates with the cell line responding to hormone-ablation therapy.
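
The comparisons recited in the two preceding paragraphs (an increase relative to an untreated control after androgen treatment; a decrease relative to an untreated control after hormone-ablation therapy) can be sketched as follows. This is a minimal illustration, not a validated assay: the function names are hypothetical, the inputs are assumed to be PCGEM1 mRNA levels already normalized to a suitable reference, and the default cut-off of 1.0 simply encodes "any increase" or "any decrease" and would need to be chosen empirically.

```python
def fold_change(measured: float, untreated: float) -> float:
    """Ratio of a measured PCGEM1 mRNA level to the untreated control level."""
    if untreated <= 0:
        raise ValueError("untreated control level must be positive")
    return measured / untreated


def is_androgen_responsive(androgen_treated: float, untreated: float,
                           min_fold: float = 1.0) -> bool:
    """True when PCGEM1 mRNA increases after incubation with an androgen."""
    return fold_change(androgen_treated, untreated) > min_fold


def responds_to_ablation(after_therapy: float, untreated: float,
                         max_fold: float = 1.0) -> bool:
    """True when PCGEM1 mRNA decreases following hormone-ablation therapy."""
    return fold_change(after_therapy, untreated) < max_fold
```

With hypothetical normalized levels of 8.0 (androgen-treated) and 2.0 (untreated), the fold change is 4.0 and the cell line would be flagged as androgen-responsive under these assumptions.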

[0076] In another aspect of the invention, these nucleic acid molecules may be introduced into a recombinant vector, such as a plasmid, cosmid, or virus, which can be used to transfect or transduce a host cell. The nucleic acids of the present invention may be combined with other DNA sequences, such as promoters, polyadenylation signals, restriction enzyme sites, multiple cloning sites, and other coding sequences.

[0077] Probes

[0078] Among the uses of nucleic acids of the invention is the use of fragments as probes or primers. Such fragments generally comprise at least about 17 contiguous nucleotides of a DNA sequence. The fragment may have fewer than 17 nucleotides, such as, for example, 10 or 15 nucleotides. In other embodiments, a DNA fragment comprises at least 20, at least 30, or at least 60 contiguous nucleotides of a DNA sequence. Examples of probes or primers of the invention include those of SEQ ID NO: 5, SEQ ID NO: 6, and SEQ ID NO: 7, as well as those disclosed in Table I.

TABLE I

Primer  Sequence (5′→3′)             S/AS  Starting Base #  SEQ ID NO.
p413    TGGCAACAGGCAAGCAGAG          S     510              SEQ ID NO: 9
p414    GGCCAAAATAAAACCAAACAT        AS    610              SEQ ID NO: 10
p489    GCAAATATGATTTAAAGATACAAC     S     752              SEQ ID NO: 11
p490    GGTTGTATCTTTAAATCATATTTGC    AS    776              SEQ ID NO: 12
p491    ACTGTCTTTTCATATATTTCTCAATGC  S     559              SEQ ID NO: 13
p517    AAGTAGTAATTTTAAACATGGGAG     AS    1516             SEQ ID NO: 14
p518    TTTTTCAATTAGGCAGCAACC        S     131              SEQ ID NO: 15
p519    GAATTGTCTTTGTGATTGTTTTTAG    S     1338             SEQ ID NO: 16
p560    CAATTCACAAAGACAATTCAGTTAAG   AS    1355             SEQ ID NO: 17
p561    ACAATTAGACAATGTCCAGCTGA      AS    1154             SEQ ID NO: 18
p562    CTTTGGCTGATATCATGAAGTGTC     AS    322              SEQ ID NO: 19
p623    AACCTTTTGCCCTATGCCGTAAC      S     148              SEQ ID NO: 20
p624    GAGACTCCCAACCTGATGATGT       AS    376              SEQ ID NO: 21
p839    GGTCACGTTGAGTCCCAGTG         AS    270              SEQ ID NO: 22
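
As an illustrative check on the sense (S) and antisense (AS) designations in Table I, the following Python sketch computes a reverse complement and compares primers p489 (S) and p490 (AS), whose sequences are taken from the table. The helper names are hypothetical, and the sketch assumes plain A/C/G/T sequences with no ambiguity codes.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def anneals_to_same_region(sense: str, antisense: str) -> bool:
    """True when the reverse complement of one primer lies within the other."""
    return (reverse_complement(sense) in antisense
            or reverse_complement(antisense) in sense)


P489 = "GCAAATATGATTTAAAGATACAAC"   # S,  starting base 752 (Table I)
P490 = "GGTTGTATCTTTAAATCATATTTGC"  # AS, starting base 776 (Table I)

# The reverse complement of p489 equals p490 without its first base,
# so the two primers cover the same region on opposite strands.
overlap = anneals_to_same_region(P489, P490)
```

This relationship is consistent with the starting bases in the table if the AS starting base is read as the highest template coordinate covered by the antisense primer (752 + 25 - 1 = 776); that reading is an inference from the table, not a statement made in the text.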

[0079] However, even larger probes may be used. For example, a particularly preferred probe is derived from PCGEM1 (SEQ ID NO: 1) and comprises nucleotides 116 to 1140 of that sequence. It has been designated SEQ ID NO: 4 and is set forth in FIG. 12.
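
For readers working with the sequence electronically, a probe defined by nucleotide positions can be extracted as sketched below. The sketch assumes the positions 116 and 1140 are 1-based and inclusive, which gives a probe of 1140 - 116 + 1 = 1025 nucleotides. The function name is hypothetical, and the sequence itself must be supplied from FIG. 8.

```python
PROBE_FIRST = 116   # first nucleotide of the probe (assumed 1-based)
PROBE_LAST = 1140   # last nucleotide of the probe (assumed inclusive)


def subsequence_1based(sequence: str, first: int, last: int) -> str:
    """Return nucleotides first..last (1-based, inclusive) of a sequence."""
    if not 1 <= first <= last <= len(sequence):
        raise ValueError("positions fall outside the sequence")
    # Python slices are 0-based and end-exclusive, hence first - 1.
    return sequence[first - 1:last]


def probe_length(first: int = PROBE_FIRST, last: int = PROBE_LAST) -> int:
    """Length of an inclusive 1-based interval."""
    return last - first + 1
```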

[0080] When a hybridization probe binds to a target sequence, it forms a duplex molecule that is both stable and selective. These nucleic acid molecules may be readily prepared, for example, by chemical synthesis or by recombinant techniques. A wide variety of methods are known in the art for detecting hybridization, including fluorescent, radioactive, or enzymatic means, or other ligands such as avidin/biotin.

[0081] In another aspect of the invention, these nucleic acid molecules may be introduced into a recombinant vector, such as a plasmid, cosmid, or virus, which can be used to transfect or transduce a host cell. The nucleic acids of the present invention may be combined with other DNA sequences, such as promoters, polyadenylation signals, restriction enzyme sites, multiple cloning sites, and other coding sequences.

[0082] Because homologs of SEQ ID NO: 1, SEQ ID NO: 2, and SEQ ID NO: 8 from other mammalian species are contemplated herein, probes based on the human DNA sequence of SEQ ID NO: 1, SEQ ID NO: 2, and SEQ ID NO: 8 may be used to screen cDNA libraries derived from other mammalian species, using conventional cross-species hybridization techniques.

[0083] In another aspect of the invention, one can use the knowledge of the genetic code in combination with the sequences set forth herein to prepare sets of degenerate oligonucleotides. Such oligonucleotides are useful as primers, e.g., in polymerase chain reactions (PCR), whereby DNA fragments are isolated and amplified. Particularly preferred primers are set forth in FIG. 13 and Table I and are designated SEQ ID NOS: 5-7 and 9-22, respectively. A particularly preferred primer pair is p518 (SEQ ID NO: 15) and p839 (SEQ ID NO: 22), which, when used in PCR, preferentially amplifies mRNA, thereby avoiding less desirable cross-reactivity with genomic DNA.
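As a non-limiting sketch of how a set of degenerate oligonucleotides may be derived from the genetic code, the following example enumerates every DNA sequence capable of encoding a short peptide under the standard genetic code. The peptide "MWKF" and the function names are hypothetical and chosen only for illustration; they are not taken from the sequences disclosed herein.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons enumerated in TCAG order
# (TTT, TTC, TTA, TTG, TCT, ...). "*" marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_for(amino_acid: str) -> list:
    """Return all codons that encode the given one-letter amino acid."""
    return [c for c, aa in CODON_TABLE.items() if aa == amino_acid]


def degenerate_oligos(peptide: str):
    """Yield every DNA sequence that encodes the peptide."""
    for combo in product(*(codons_for(aa) for aa in peptide)):
        yield "".join(combo)


def degeneracy(peptide: str) -> int:
    """Number of distinct oligonucleotides in the degenerate set."""
    count = 1
    for aa in peptide:
        count *= len(codons_for(aa))
    return count


# Hypothetical peptide used only to illustrate the enumeration.
oligos = sorted(degenerate_oligos("MWKF"))
print(degeneracy("MWKF"), oligos)
```

Peptides rich in amino acids with few codons (such as methionine and tryptophan) give small degenerate sets, which is one reason such regions are commonly favored when designing degenerate primers.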

[0084] Chromosome Mapping

[0085] As set forth in Example 3, the PCGEM1 gene has been mapped by fluorescent in situ hybridization to the 2q32 region of chromosome 2 using a bacterial artificial chromosome (BAC) clone containing PCGEM1 genomic sequence. Thus, all or a portion of the nucleic acid molecule of SEQ ID NO:1, SEQ ID NO:2, and SEQ ID NO:8, including oligonucleotides, can be used by those skilled in the art using well-known techniques to identify human chromosome 2, and the specific locus thereof, that contains the PCGEM1 DNA. Useful techniques include, but are not limited to, using the nucleotide sequence of SEQ ID NO:1, SEQ ID NO:2, or SEQ ID NO:8, or fragments thereof, including oligonucleotides, as a probe in various well-known techniques such as radiation hybrid mapping (high resolution), in situ hybridization to chromosome spreads (moderate resolution), and Southern blot hybridization to hybrid cell lines containing individual human chromosomes (low resolution).

[0086] For example, chromosomes can be mapped by radiation hybrid mapping. First, PCR is performed using the Whitehead Institute/MIT Center for Genome Research Genebridge4 panel of 93 radiation hybrids (http://www-genome.wi.mit.edu/ftp/distribution/human_STS_releases/july97/rhmap/genebridge4.html). Primers are used which lie within a putative exon of the gene of interest and which amplify a product from human genomic DNA, but do not amplify hamster genomic DNA. The results of the PCRs are converted into a data vector that is submitted to the Whitehead/MIT Radiation Mapping site on the internet (http://www-seq.wi.mit.edu). The data is scored and the chromosomal assignment and placement relative to known Sequence Tag Site (STS) markers on the radiation hybrid map is provided. (The following web site provides additional information about radiation hybrid mapping: http://www-genome.wi.mit.edu/ftp/distribution/human_STS_releases/july97/07-97.INTRO.html).
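The conversion of PCR results into a data vector may be sketched as follows. This is an illustrative example only: the panel size of 93 is taken from the text, whereas the encoding (0 for no product, 1 for a product, 2 for an ambiguous or missing result), the result labels, and the function name are assumptions made for the sketch and should be checked against the requirements of the mapping service actually used.

```python
PANEL_SIZE = 93  # number of radiation hybrids in the Genebridge4 panel

# Assumed encoding of each PCR outcome as one character of the vector.
CODES = {"negative": "0", "positive": "1", "ambiguous": "2"}


def to_data_vector(results: list) -> str:
    """Encode one PCR outcome per hybrid, in panel order, as a string."""
    if len(results) != PANEL_SIZE:
        raise ValueError(f"expected {PANEL_SIZE} results, got {len(results)}")
    return "".join(CODES[outcome] for outcome in results)


# Hypothetical outcomes: product in hybrid 5, unclear result in hybrid 11.
example = ["negative"] * PANEL_SIZE
example[4] = "positive"
example[10] = "ambiguous"
vector = to_data_vector(example)
print(vector)
```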

[0087] Identifying Associated Diseases

[0088] As noted above, PCGEM1 has been mapped to the 2q32 region of chromosome 2. This region is associated with specific diseases, which include but are not limited to diabetes mellitus (insulin dependent), and T cell leukemia/lymphoma. Thus, the nucleic acids of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO:8, or fragments thereof, can be used by one skilled in the art using well-known techniques to analyze abnormalities associated with gene mapping to chromosome 2. This enables one to distinguish conditions in which this marker is rearranged or deleted. In addition, nucleotides of SEQ ID NO:1, SEQ ID NO:2, or SEQ ID NO:8, or fragments thereof, can be used as a positional marker to map other genes of unknown location.

[0089] The DNA may be used in developing treatments for any disorder mediated (directly or indirectly) by defective, or insufficient amounts of, PCGEM1, including prostate cancer. Disclosure herein of native nucleotide sequences permits the detection of defective genes, and the replacement thereof with normal genes. Defective genes may be detected in in vitro diagnostic assays, and by comparison of a native nucleotide sequence disclosed herein with that of a gene derived from a person suspected of harboring a defect in this gene.

[0090] Sense-Antisense

[0091] Other useful fragments of the nucleic acids include antisense or sense oligonucleotides comprising a single-stranded nucleic acid sequence (either RNA or DNA) capable of binding to target mRNA (sense) or DNA (antisense) sequences. Antisense or sense oligonucleotides, according to the present invention, comprise a fragment of DNA (SEQ ID NO:1, SEQ ID NO:2, or SEQ ID NO:8). Such a fragment generally comprises at least about 14 nucleotides, preferably from about 14 to about 30 nucleotides. The ability to derive an antisense or a sense oligonucleotide, based upon a cDNA sequence encoding a given protein, is described in, for example, Stein and Cohen (Cancer Res. 48:2659, 1988) and van der Krol et al. (BioTechniques 6:958, 1988).

[0092] The biologic activity of PCGEM1 in assay cells and the over-expression of PCGEM1 in prostate cancer tissues suggest that elevated levels of PCGEM1 promote prostate cancer cell growth. Thus, the antisense oligonucleotides to PCGEM1 may be used to reduce the expression of PCGEM1 and, consequently, inhibit the growth of the cancer cells.

[0093] Binding of antisense or sense oligonucleotides to target nucleic acid sequences results in the formation of duplexes. The antisense oligonucleotides thus may be used to block expression of proteins or to inhibit the function of RNA. Antisense or sense oligonucleotides further comprise oligonucleotides having modified sugar-phosphodiester backbones (or other sugar linkages, such as those described in WO 91/06629) and wherein such sugar linkages are resistant to endogenous nucleases. Such oligonucleotides with resistant sugar linkages are stable in vivo (i.e., capable of resisting enzymatic degradation) but retain sequence specificity to be able to bind to target nucleotide sequences.

[0094] Other examples of sense or antisense oligonucleotides include those oligonucleotides which are covalently linked to organic moieties, such as those described in WO 90/10448, and other moieties that increase affinity of the oligonucleotide for a target nucleic acid sequence, such as poly-(L-lysine). Further still, intercalating agents, such as ellipticine, and alkylating agents or metal complexes may be attached to sense or antisense oligonucleotides. Such modifications may modify binding specificities of the antisense or sense oligonucleotide for the target nucleotide sequence.

[0095] Antisense or sense oligonucleotides may be introduced into a cell containing the target nucleic acid sequence by any gene transfer method, including, for example, lipofection, CaPO₄-mediated DNA transfection, electroporation, or by using gene transfer vectors such as Epstein-Barr virus or adenovirus.

[0096] Sense or antisense oligonucleotides also may be introduced into a cell containing the target nucleotide sequence by formation of a conjugate with a ligand binding molecule, as described in WO 91/04753. Suitable ligand binding molecules include, but are not limited to, cell surface receptors, growth factors, other cytokines, or other ligands that bind to cell surface receptors. Preferably, conjugation of the ligand binding molecule does not substantially interfere with the ability of the ligand binding molecule to bind to its corresponding molecule or receptor, or block entry of the sense or antisense oligonucleotide or its conjugated version into the cell.

[0097] Alternatively, a sense or an antisense oligonucleotide may be introduced into a cell containing the target nucleic acid sequence by formation of an oligonucleotide-lipid complex, as described in WO 90/10448. The sense or antisense oligonucleotide-lipid complex is preferably dissociated within the cell by an endogenous lipase.

[0098] Polypeptides and Fragments Thereof

[0099] The invention also encompasses polypeptides and fragments thereof in various forms, including those that are naturally occurring or produced through various techniques such as procedures involving recombinant DNA technology. Such forms include, but are not limited to, derivatives, variants, and oligomers, as well as fusion proteins or fragments thereof.

[0100] The polypeptides of the invention include full length proteins encoded by the nucleic acid sequences set forth above. The polypeptides of the invention may be membrane bound or they may be secreted and thus soluble. The invention also includes the expression, isolation and purification of the polypeptides and fragments of the invention, accomplished by any suitable technique.

[0101] The following examples further illustrate preferred aspects of the invention.

EXAMPLE 1 Differential Gene Expression Analysis in Prostate Cancer

[0102] Using the differential display technique, we identified a novel gene that is over-expressed in prostate cancer cells. Differential display provides a method to separate and clone individual messenger RNAs by means of the polymerase chain reaction, as described in Liang et al., Science, 257:967-71 (1992), which is hereby incorporated by reference. Briefly, the method entails using two groups of oligonucleotide primers. One group is designed to recognize the polyadenylate tail of messenger RNAs. The other group contains primers that are short and arbitrary in sequence and anneal to positions in the messenger RNA randomly distributed from the polyadenylate tail. Products amplified with these primers can be differentiated on a sequencing gel based on their size. If different cell populations are amplified with the same groups of primers, one can compare the amplification products to identify differentially expressed RNA sequences.

[0103] Differential display (“DD”) kits from Genomyx (Foster City, Calif.) were used to analyze differential gene expression. The steps of the differential display technique are summarized in FIG. 1. Histologically well defined matched tumor and normal prostate tissue sections containing approximately similar proportions of epithelial cells were chosen from individual prostate cancer patients.

[0104] Genomic DNA-free total RNA was extracted from this enriched pool of cells using RNAzol B (Tel-Test, Inc., Friendswood, Tex.) according to the manufacturer's protocol. The epithelial nature of the RNA source was further confirmed using cytokeratin 18 expression (45) in reverse transcriptase-polymerase chain reaction (RT-PCR) assays. Using arbitrary and anchored primers containing 5′ M13 or T7 sequences (obtained from Biomedical Instrumentation Center, Uniformed Services University of the Health Sciences, Bethesda), the isolated DNA-free total RNA was amplified by RT-PCR, which was performed using ten anchored antisense primers and four arbitrary sense primers according to the protocol provided by Hieroglyph™ RNA Profile Kit 1 (Genomyx Corporation, Calif.). The cDNA fragments produced by the RT-PCR assay were analyzed by high resolution gel electrophoresis, carried out by using Genomyx™ LR DNA sequencer and LR-Optimized™ HR-1000™ gel formulations (Genomyx Corporation, CA).

[0105] A partial DD screening of normal/tumor tissues revealed 30 differentially expressed cDNA fragments, with 53% showing reduced or no expression in tumor RNA specimens and 47% showing over-expression in tumor RNA specimens (FIG. 2). These cDNAs were excised from the DD gels, reamplified using T7 and M13 primers and the RT-PCR conditions recommended in Hieroglyph™ RNA Profile Kit-1 (Genomyx Corp., CA), and sequenced. The inclusion of T7 and M13 sequencing primers in the DD primers allowed rapid sequencing and orientation of cDNAs (FIG. 1).

[0106] All the reamplified cDNA fragments were purified by Centricon-c-100 system (Amicon, USA). The purified fragments were sequenced by cycle sequencing and DNA sequence determination using an ABI 377 DNA sequencer. Isolated sequences were analyzed for sequence homology with known sequences by running searches through publicly available DNA sequence databases, including the National Center for Biotechnology Information and the Cancer Genome Anatomy Project. Approximately two-thirds of these cDNA sequences exhibited homology to previously described DNA sequences/genes, e.g., ribosomal proteins, mitochondrial DNA sequences, growth factor receptors, and genes involved in maintaining the redox state in cells. About one-third of the cDNAs represented novel sequences, which did not exhibit similarity to the sequences available in publicly available databases. The PCGEM1 fragment, obtained from the initial differential display screening, represents a 530 base pair (nucleotides 410 to 940 of SEQ ID NO: 1) cDNA sequence which, in initial searches, did not exhibit any significant homology with sequences in the publicly available databases. Later searching of the high throughput genome sequence (HTGS) database revealed perfect homology to a chromosome 2 derived uncharacterized, unfinished genomic sequence (accession #AC 013401).

EXAMPLE 2 Characterization of Full Length PCGEM1 cDNA Sequence

[0107] The full length of PCGEM1 was obtained by 5′ and 3′ RACE/PCR from the original 530 bp DD product (nucleotides 410 to 940 of PCGEM1 cDNA SEQ ID NO: 1) using a normal prostate cDNA library in lambda phage (Clontech, CA). The RACE/PCR products were directly sequenced. Lasergene and MacVector DNA analysis software were used to analyze DNA sequences and to define open reading frame regions. We also used the original DD product to screen a normal prostate cDNA library. Three overlapping cDNA clones were identified.

[0108] Sequencing of the cDNA clones was performed on an ABI-310 sequence analyzer using a new dRhodamine cycle sequencing kit (PE-Applied Biosystem, CA). The longest PCGEM1 cDNA clone, SEQ ID NO:1 (FIG. 8), revealed 1643 nucleotides with a potential polyadenylation site, ATTAAA, close to the 3′ end followed by a poly (A) tail. As noted above, although initial searching of the PCGEM1 gene in publicly available DNA databases (e.g., National Center for Biotechnology Information) using the BLAST program did not reveal any homology, a recent search of the HTGS database revealed perfect homology of PCGEM1 (using cDNA of SEQ ID NO: 1) to a chromosome 2 derived uncharacterized, unfinished genomic sequence (accession #AC 013401). One of the cDNA clones, SEQ ID NO:2 (FIG. 9), contained a 123 bp insertion at 278, and this inserted sequence showed strong homology (87%) to Alu sequence. It is likely that this clone represented the premature transcripts. Sequencing of several clones from RT-PCR further confirmed the presence of the two forms of transcripts.

[0109] Sequence analysis did not reveal any significant long open reading frame in either strand. The longest ORF in the sense strand was 105 nucleotides (572-679), encoding a 35 amino acid peptide. However, the ATG was not in a strong context of initiation. Although we could not rule out the coding capacity for a very small peptide, it is possible that PCGEM1 may function as a non-coding RNA.
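For illustration, a search for the longest open reading frame on one strand can be sketched as below. This is a simplified example and not the Lasergene or MacVector analysis actually used; the toy sequence and the function name are hypothetical. The sketch reports 1-based coordinates that include the stop codon, together with a coding length that excludes it. Under that convention, an interval such as 572-679 spans 108 nucleotides, which is consistent with 105 coding nucleotides (35 codons) followed by a 3-nucleotide stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq: str):
    """Find the longest ATG...stop ORF in the three forward frames.

    Returns (start, end, coding_nt) where start and end are 1-based and
    inclusive, end covers the stop codon, and coding_nt excludes the stop
    codon. Returns None when no complete ORF is present.
    """
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                coding_nt = i - start
                if best is None or coding_nt > best[2]:
                    best = (start + 1, i + 3, coding_nt)
                start = None  # look for the next ORF in this frame
    return best


# Hypothetical toy sequence: ATG AAA TTT GGG TAA embedded in flanking bases.
print(longest_orf("CCATGAAATTTGGGTAACC"))
```

To examine both strands, the same function may be applied to the reverse complement of the sequence.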

[0110] The sequence of PCGEM1 cDNA has been verified by several approaches including characterization of several clones of PCGEM1 and analysis of PCGEM1 cDNAs amplified from normal prostate tissue and prostate cancer cell lines. We have also obtained the genomic clones of PCGEM1, which has helped to confirm the PCGEM1 cDNA sequence. The complete genomic DNA sequence of PCGEM1 (SEQ ID NO:8) is shown in FIG. 14. In FIG. 14 (and in the accompanying Sequence Listing), “Y” represents any one of the four nucleotide bases, cytosine, thymine, adenine, or guanine. Comparison of the cDNA and genomic sequences revealed the organization of the PCGEM1 transcription unit from three exons (FIG. 15: E, Exon; B: BamHI; H: HindIII; X: XbaI; R: EcoRI).

EXAMPLE 3 Mapping the Location of PCGEM1

[0111] Using fluorescent in situ hybridization and the PCGEM1 genomic DNA as a probe, we mapped the location of PCGEM1 on chromosome 2q to specific region 2q32 (FIG. 7A). Specifically, a Bacterial Artificial Chromosome (BAC) clone containing the PCGEM1 genomic sequence was isolated by custom services of Genome Systems (St. Louis, Mo.). PCGEM1-Bac clone 1 DNA was nick translated using spectrum orange (Vysis) as a direct label, and fluorescent in situ hybridization was done using this probe on normal human male metaphase chromosome spreads. Counterstaining was done and chromosomal localization was determined based on the G-band analysis of inverted 4′,6-diamidino-2-phenylindole (DAPI) images. (FIG. 7B: a DAPI counter-stained chromosome 2 is shown on the left; an inverted DAPI stained chromosome 2 shown as G-bands is shown in the center; an ideogram of chromosome 2 showing the localization of the signal to band 2q32 (bar) is shown on the right.) NU200 image acquisition and registration software was used to create the digital images. More than 20 metaphases were analyzed.

EXAMPLE 4 Analysis of PCGEM1 Gene Expression in Prostate Cancer

[0112] To further characterize the tumor specific expression of the PCGEM1 fragment, and also to rule out individual variations of gene expression alterations commonly observed in tumors, the expression of the PCGEM1 fragment was evaluated on a test panel of matched tumor and normal RNAs derived from the microdissected tissues of twenty prostate cancer patients.

[0113] Using the PCGEM1 cDNA sequence (SEQ ID NO:1), specific PCR primers (Sense primer 1 (SEQ ID NO: 5): 5′ TGCCTCAGCCTCCCAAGTAAC 3′ and Antisense primer 2 (SEQ ID NO: 6): 5′ GGCCAAAATAAAACCAAACAT 3′) were designed for RT-PCR assays. Radical prostatectomy derived, OCT compound (Miles Inc., Elkhart, Ind.) embedded, fresh frozen normal and tumor tissues from prostate cancer patients were characterized for histopathology by examining hematoxylin and eosin stained sections (46). Tumor and normal prostate tissue regions representing approximately equal numbers of epithelial cells were dissected out of frozen sections. DNA-free RNA was prepared from these tissues and used in RT-PCR analysis to detect PCGEM1 expression. One hundred nanograms of total RNA was reverse transcribed into cDNA using an RT-PCR kit (Perkin-Elmer, Foster City, Calif.). The PCR was performed using Amplitaq Gold from Perkin-Elmer (Foster City, Calif.). PCR cycles used were: 95° C. for 10 minutes, 1 cycle; 95° C. for 30 seconds, 55° C. for 30 seconds, 72° C. for 30 seconds, 42 cycles; and 72° C. for 5 minutes, 1 cycle, followed by 4° C. storage. Epithelial cell-associated cytokeratin 18 was used as an internal control.
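The thermal cycling conditions recited above may be represented as a simple data structure, for example to estimate the programmed hold time of a run. The following is an illustrative sketch only; the `Stage` class and function name are introduced here for convenience, and the estimate ignores temperature ramp times and the final 4° C. storage step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    steps: tuple  # (temperature in deg C, hold time in seconds) pairs
    cycles: int   # number of times the steps are repeated


# Conditions as recited in the text.
PROGRAM = [
    Stage(steps=((95, 600),), cycles=1),                      # initial 10 min
    Stage(steps=((95, 30), (55, 30), (72, 30)), cycles=42),   # amplification
    Stage(steps=((72, 300),), cycles=1),                      # final 5 min
]


def hold_time_seconds(program) -> int:
    """Sum of all programmed hold times, excluding ramping between steps."""
    return sum(
        stage.cycles * sum(seconds for _, seconds in stage.steps)
        for stage in program
    )


total = hold_time_seconds(PROGRAM)
print(f"{total} s ({total / 60:.0f} min)")
```

For the conditions above, the hold times sum to 600 + 42 × 90 + 300 = 4680 seconds, i.e., 78 minutes before ramping is taken into account.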

[0114] RT-PCR analysis of microdissected matched normal and tumor tissue derived RNAs from 23 CaP patients revealed tumor associated over-expression of PCGEM1 in 13 (56%) of the patients (FIG. 5). Six of twenty-three (26%) patients did not exhibit detectable PCGEM1 expression in either normal or tumor tissue derived RNAs. Three of twenty-three (13%) tumor specimens showed reduced expression in tumors. One of the patients did not exhibit any change. Expression of housekeeping genes, cytokeratin-18 (FIG. 3) and glyceraldehyde-3-phosphate dehydrogenase (GAPDH) (data not shown), remained constant in tumor and normal specimens of all the patients (FIG. 3). These results were further confirmed by another set of PCGEM1 specific primers (Sense Primer 3 (SEQ ID NO: 7): 5′ TGGCAACAGGCAAGCAGAG 3′ and Antisense Primer 2 (SEQ ID NO: 6): 5′ GGCCAAAATAAAACCAAACAT 3′). Four of 16 (25%) patients did not exhibit detectable PCGEM1 expression in either normal or tumor tissue derived RNAs. Two of 16 (12.5%) tumor specimens showed reduced expression in tumors. These results of PCGEM1 expression in tumor tissues could be explained by the expected individual variations between tumors of different patients. Most importantly, initial DD observations were confirmed by showing that 45% of patients analyzed did exhibit over-expression of PCGEM1 in tumor prostate tissues when compared to corresponding normal prostate tissue of the same individual.
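The patient counts reported for the first primer set can be tallied as in the following sketch, which confirms that the four categories account for all 23 patients and recomputes each percentage. The category labels and function name are illustrative; only the counts are taken from the text.

```python
# Counts for the first primer set (SEQ ID NO: 5 and SEQ ID NO: 6).
FIRST_PRIMER_SET = {
    "over-expressed in tumor": 13,
    "not detectable in normal or tumor": 6,
    "reduced in tumor": 3,
    "no change": 1,
}


def percentages(counts: dict) -> dict:
    """Return each category as a percentage of the total, to one decimal."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


total_patients = sum(FIRST_PRIMER_SET.values())
shares = percentages(FIRST_PRIMER_SET)
for label, share in shares.items():
    print(f"{label}: {FIRST_PRIMER_SET[label]}/{total_patients} = {share}%")
```

Computed to one decimal place, the over-expression category is 56.5% of the 23 patients, which the text reports as 56%.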

EXAMPLE 5 In situ Hybridization

[0115] In situ hybridization was performed essentially as described by Wilkinson and Green (48). Briefly, OCT embedded tissue slides stored at −80° C. were fixed in 4% PFA (paraformaldehyde), digested with proteinase K and then again fixed in 4% PFA. After washing in PBS, sections were treated with 0.25% acetic anhydride in 0.1 M triethanolamine, washed again in PBS, and dehydrated in a graded ethanol series. Sections were hybridized with ³⁵S-labeled riboprobes at 52° C. overnight. After washing and RNase A treatment, sections were dehydrated, dipped into NTB-2 emulsion and exposed for 11 days at 4° C. After development, slides were lightly stained with hematoxylin and mounted for microscopy. In each section, PCGEM1 expression was scored as percentage of cells showing ³⁵S signal: 1+, 1-25%; 2+, 25-50%; 3+, 50-75%; 4+, 75-100%.
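The scoring scale may be expressed as a small function, as sketched below. Because the recited ranges share their boundary values (for example, 25% appears in both the 1+ and 2+ ranges), the sketch assumes that each boundary belongs to the lower grade and that a value below 1% is scored as 0. These tie-breaking choices and the function name are assumptions made for the example and are not stated in the text.

```python
def score_section(percent_positive: float) -> str:
    """Map the percentage of signal-positive cells to the 0 to 4+ scale.

    Assumptions: boundary values fall into the lower grade, and
    anything below 1% is treated as undetectable (score 0).
    """
    if not 0 <= percent_positive <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_positive < 1:
        return "0"
    if percent_positive <= 25:
        return "1+"
    if percent_positive <= 50:
        return "2+"
    if percent_positive <= 75:
        return "3+"
    return "4+"


for value in (0, 10, 25, 40, 60, 90):
    print(value, score_section(value))
```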

[0116] Paired normal (benign) and tumor specimens from 13 patients were tested using in situ hybridization. A representative example is shown in FIG. 17. In 11 cases (84%) tumor associated elevation of PCGEM1 expression was detected. In 5 of these 11 patients the expression of PCGEM1 increased to 1+ in the tumor area from an essentially undetectable level in the normal area (on the 0 to 4+ scale). Tumor specimens from 4 of 11 patients scored between 2+ (example shown in FIG. 17B) and 4+. Two of 11 patients showed focal signals with 3+ score in the tumor area, and one of these patients had similar focal signal (2+) in an area pathologically designated as benign. In the remaining 2 of the 13 cases there was no detectable signal in any of the tissue areas tested. The results indicate that PCGEM1 expression appears to be restricted to glandular epithelial cells. (FIG. 17 shows an example of in situ hybridization of ³⁵S labeled PCGEM1 riboprobe to matched normal (A) versus tumor (B) sections of prostate cancer patients. The light gray areas are hematoxylin stained cell bodies; the black dots represent the PCGEM1 expression signal. The signal is background level in the normal (A) section and 2+ level in the tumor (B) section. The magnification is 40×.)

EXAMPLE 6 PCGEM1 Gene Expression in Prostate Tumor Cell Lines

[0117] PCGEM1 gene expression was also evaluated in established prostate cancer cell lines: LNCaP, DU145, PC3 (all from ATCC), DuPro (available from Dr. David Paulson, Duke University, Durham, N.C.), and an E6/E7-immortalized primary prostate cancer cell line, CPDR1 (47). CPDR1 is a primary CaP derived cell line immortalized by the retroviral vector LXSN 16 E6 E7, expressing the E6 and E7 genes of human papilloma virus 16. LNCaP is a well studied, androgen-responsive prostate cancer cell line, whereas DU145, PC3, DuPro and CPDR1 are androgen-independent and lack detectable expression of the androgen receptor. Utilizing the RT-PCR assay described above, PCGEM1 expression was easily detectable in LNCaP (FIG. 4). However, PCGEM1 expression was not detected in prostate cancer cell lines DU145, PC3, DuPro and CPDR1. Thus, PCGEM1 was expressed in the androgen-responsive cell line but not in the androgen-independent cell lines. These results indicate that hormones, particularly androgen, may play a key role in regulating PCGEM1 expression in prostate cancer cells. In addition, the results suggest that PCGEM1 expression may be used to distinguish between hormone responsive tumor cells and more aggressive hormone refractory tumor cells.

[0118] To test if PCGEM1 expression is regulated by androgens, we performed experiments evaluating PCGEM1 expression in LNCaP cells (ATCC) cultured with and without androgens. Total RNA from LNCaP cells treated with the synthetic androgen R1881 (DUPONT, Boston, Mass.) was analyzed for PCGEM1 expression. Both RT-PCR analysis (FIG. 5a) and Northern blot analysis (FIG. 5b) were conducted as follows.

[0119] LNCaP cells were maintained in RPMI 1640 (Life Technologies, Inc., Gaithersburg, Md.) supplemented with 10% fetal bovine serum (FBS, Life Technologies, Inc., Gaithersburg, Md.) and experiments were performed on cells between passages 20 and 35. For the studies of NKX3.1 gene expression regulation, charcoal/dextran stripped androgen-free FBS (cFBS, Gemini Bio-Products, Inc., Calabasas, Calif.) was used. LNCaP cells were cultured first in RPMI 1640 with 10% cFBS for 4 days and then stimulated with a non-metabolizable androgen analog R1881 (DUPONT, Boston, Mass.) at different concentrations for different times as shown in FIG. 5A. LNCaP cells identically treated but without R1881 served as control. Poly A+ RNA derived from cells treated with/without R1881 was extracted at indicated time points with RNAzol B (Tel-Test, Inc., TX), fractionated (2 μg/lane) by running on a 1% formaldehyde-agarose gel, and transferred to nylon membrane. Northern blots were analyzed for the expression of PCGEM1 using the nucleic acid molecule set forth in SEQ ID NO: 4 as a probe. The RNA from LNCaP cells treated with R1881 and RNA from control LNCaP cells were also analyzed by RT-PCR assays as described in Example 4.

[0120] As set forth in FIGS. 5a and 5b, PCGEM1 expression increases in response to androgen treatment. This finding further supports the hypothesis that PCGEM1 expression is regulated by androgens in prostate cancer cells.

EXAMPLE 7 Tissue Specificity of PCGEM1 Expression

[0121] Multiple tissue Northern blots (Clontech, CA) conducted according to the manufacturer's directions revealed prostate tissue-specific expression of PCGEM1. Polyadenylate RNAs of 23 different human tissues (heart, brain, placenta, lung, liver, skeletal muscle, kidney, pancreas, spleen, thymus, prostate, testis, ovary, small intestine, colon, peripheral blood, stomach, thyroid, spinal cord, lymph node, trachea, adrenal gland and bone marrow) were probed with the 530 base pair PCGEM1 cDNA fragment (nucleotides 410 to 940 of SEQ ID NO: 1). A 1.7 kilobase mRNA transcript hybridized to the PCGEM1 probe in prostate tissue (FIG. 6a). Hybridization was not observed in any of the other human tissues (FIG. 6a). Two independent experiments revealed identical results.

[0122] Additional Northern blot analyses on an RNA master blot (Clontech, CA) conducted according to the manufacturer's directions confirm the prostate tissue specificity of the PCGEM1 gene (FIG. 6b). Northern blot analyses reveal that the prostate tissue specificity of PCGEM1 is comparable to the well known prostate marker PSA (77 mer oligo probe) and far better than two other prostate specific genes, PSMA (234 bp fragment from PCR product) and NKX3.1 (210 bp cDNA). For instance, PSMA is expressed in the brain (37) and in the duodenal mucosa and a subset of proximal renal tubules (38). While NKX3.1 exhibits high levels of expression in adult prostate, it is also expressed at lower levels in testis tissue and several other tissues (39).

EXAMPLE 8 Biologic Functions of the PCGEM1

[0123] The tumor associated PCGEM1 over-expression suggested that the increased expression of PCGEM1 may favor tumor cell proliferation. NIH3T3 cells have been extensively used to define cell growth promoting functions associated with a wide variety of genes (40-44). Utilizing pcDNA3.1/Hygro(+/−) (Invitrogen, CA), PCGEM1 expression vectors were constructed in sense and anti-sense orientations and were transfected into NIH3T3 cells, and hygromycin resistant colonies were counted 2-3 weeks later. Cells transfected with the PCGEM1 sense construct formed about 2 times more colonies than vector alone in three independent experiments (FIG. 10). The colonies formed by cells transfected with the PCGEM1 sense construct were also significantly larger. No appreciable difference was observed in the number of colonies between anti-sense PCGEM1 constructs and vector controls. These promising results document a cell growth promoting/cell survival function(s) associated with PCGEM1.

[0124] The function of PCGEM1, however, does not appear to be due to protein expression. To assess this hypothesis, we used the TestCode program (GCG Wisconsin Package, Madison, Wis.), which identifies potential protein coding sequences of longer than 200 bases by measuring the non-randomness of the composition at every third base, independently from the reading frames. Analysis of the PCGEM1 cDNA sequence revealed that, at greater than 95% confidence level, the sequence does not contain any region with protein coding capacity (FIG. 16A). Similar results were obtained when various published non-coding RNA sequences were analyzed with the TestCode program (data not shown), while known protein coding regions of similar size, i.e., alpha actin (FIG. 16B), can be detected with high fidelity. (In FIG. 16, evaluation of the coding capacity of PCGEM1 (A) and human alpha actin (B) is performed independently from the reading frame, by using the TestCode program. The number of base pairs is indicated on the X-axis; the TestCode values are shown on the Y-axis. Regions of longer than 200 base pairs above the upper line (at 9.5 value) are considered coding, and those under the lower line (at 7.3 value) are considered non-coding, at a confidence level greater than 95%.)
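The thresholds described for FIG. 16 can be applied to an individual TestCode value as in the following sketch. It illustrates the thresholding step only: it does not compute the TestCode statistic itself and does not model the requirement that a region exceed 200 base pairs. The label used for values between the two lines ("no call") and the function name are assumptions made for the example.

```python
UPPER_LINE = 9.5  # values above this line are considered coding
LOWER_LINE = 7.3  # values below this line are considered non-coding


def classify_testcode(value: float) -> str:
    """Classify a single TestCode value against the two threshold lines."""
    if value > UPPER_LINE:
        return "coding"
    if value < LOWER_LINE:
        return "non-coding"
    return "no call"  # assumed label for the zone between the lines


# Hypothetical values, for illustration only.
for value in (10.2, 8.4, 6.1):
    print(value, classify_testcode(value))
```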

[0125] The Codon Preference program (GCG Wisconsin Package, Madison, Wis.), which locates protein coding regions in a reading frame specific manner, further suggested the absence of protein coding capacity in the PCGEM1 gene (see www.cpdr.org). In vitro transcription/translation of PCGEM1 cDNA did not produce a detectable protein/peptide. Although we cannot unequivocally rule out the possibility that PCGEM1 codes for a short unstable peptide, at this time both experimental and computational approaches strongly suggest that PCGEM1 cDNA does not have protein coding capacity. (It should be recognized that conclusions regarding the role of PCGEM1 are speculative in nature, and should not be considered limiting in any way.)

[0126] The most intriguing aspect of PCGEM1 characterization has been its apparent lack of protein coding capacity. Although we have not completely ruled out the possibility that PCGEM1 codes for a short unstable peptide, careful sequencing of PCGEM1 cDNA and genomic clones, computational analysis of PCGEM1 sequence, and in vitro transcription/translation experiments (data not shown) strongly suggest a non-coding nature of PCGEM1. It is interesting to note that an emerging group of novel mRNA-like non-coding RNAs are being discovered whose function and mechanisms of action remain poorly understood (49). Such RNA molecules have also been termed “RNA riboregulators” because of their function(s) in development, differentiation, DNA damage, heat shock responses and tumorigenesis (40-42, 50). In the context of tumorigenesis, the H19, His-1 and Bic genes code for functional non-coding mRNAs (50). In addition, a recently reported prostate cancer associated gene, DD3, also appears to exhibit a tissue specific non-coding mRNA (51). In this regard it is important to point out that PCGEM1 and DD3 may represent a new class of prostate specific genes. The recent discovery of a steroid receptor co-activator as an mRNA lacking protein coding capacity further emphasizes the role of RNA riboregulators in critical biochemical function(s) (52). Our preliminary results showed that PCGEM1 expression in NIH3T3 cells caused a significant increase in the size of colonies in a colony forming assay and suggest that PCGEM1 cDNA confers cell proliferation and/or cell survival function(s). Elevated expression of PCGEM1 in prostate cancer cells may represent a gain in function favoring tumor cell proliferation/survival. On the basis of our first characterization of the PCGEM1 gene, we propose that PCGEM1 belongs to a novel class of prostate tissue specific genes with potential functions in prostate cell biology and the tumorigenesis of the prostate gland.

[0127] In summary, utilizing surgical specimens and rapid differential display technology, we have identified candidate genes of interest with differential expression profiles in prostate cancer specimens. In particular, we have identified a novel nucleotide sequence, PCGEM1, with no match in the publicly available DNA databases (except for the homology shown in the high throughput genome sequence database, discussed above). A PCGEM1 cDNA fragment detected a 1.7 kb mRNA on Northern blots with selective expression in prostate tissue. Furthermore, this gene was found to be up-regulated by the synthetic androgen, R1881. Careful analysis of microdissected matched tumor and normal tissues further revealed PCGEM1 over-expression in a significant percentage of prostate cancer specimens. Thus, we have provided a gene with broad implications for the diagnosis, prevention, and treatment of prostate cancer.

[0128] The specification is most thoroughly understood in light of the teachings of the references cited within the specification which are hereby incorporated by reference. The embodiments within the specification provide an illustration of embodiments of the invention and should not be construed to limit the scope of the invention. The skilled artisan readily recognizes that many other embodiments are encompassed by the invention.

[0129] References

[0130] 1. Parker S L, Tong T, Bolden S, and Wingo P A: Cancer statistics. CA Cancer J. Clin., 46:5-27, 1996.

[0131] 2. Visakorpi T, Kallioniemi O P, Koivula T and Isola J: New prognostic factors in prostate carcinoma. Eur. Urol., 24:438-449, 1993.

[0132] 3. Mostofi F K: Grading of prostate carcinoma. Cancer Chemother. Rep., 59:111, 1975.

[0133] 4. Lu-Yao G L, McLerran D, Wasson J, Wennberg J E: An assessment of radical prostatectomy. Time trends, Geographical Variations and Outcomes. JAMA, 269:2633-2636, 1993.

[0134] 5. Partin A W and Oesterling J E: The clinical usefulness of prostate-specific antigen: update 1994. J. Urol., 152:1358-1368, 1994.

[0135] 6. Wasson J H, Cushman C C, Bruskewitz R C, Littenberg B, Mulley A G, and Wennberg J E: A structured literature review of treatment for localized CaP. Arch. Fam. Med., 2:487-493, 1993.

[0136] 7. Weinberg R A: How cancer arises. Sci. Amer., 9, 62-70, 1996.

[0137] 8. Bostwick D G: High grade prostatic intraepithelial neoplasia: The most likely precursor of prostate cancer. Cancer, 75:1823-1836, 1995.

[0138] 9. Bostwick D G, Pacelli A, Lopez-Beltran A: Molecular Biology of Prostatic Intraepithelial Neoplasia. The Prostate, 29:117-134, 1996.

[0139] 10. Pannek J, Partin A W: Prostate specific antigen: What's new in 1997. Oncology, 11:1273-1278, 1997.

[0140] 11. Partin A W, Kattan M W, Subong E N, Walsh P C, Wojno K J, Oesterling J E, Scardino P T, Pearson J D: Combination of prostate specific antigen, clinical stage, and Gleason score to predict pathological stage of localized prostate cancer. A multi-institutional update. JAMA, 277:1445-1451, 1997.

[0141] 12. Gomella L G, Raj G V, Moreno J G: Reverse transcriptase polymerase chain reaction for prostate specific antigen in management of prostate cancer. J. Urol., 158:326-337, 1997.

[0142] 13. Gao C L, Dean R C, Pinto A, Mooneyhan R, Connelly R R, McLeod D G, Srivastava S, Moul J W: Detection of PSA-expressing prostatic cells in bone marrow of radical prostatectomy patients by sensitive reverse transcriptase-polymerase chain reaction (RT-PCR). 1998 International Symposium on Biology of Prostate Growth, National Institutes of Health, p. 83, 1998.

[0143] 14. Garnick M B, Fair W R: Prostate cancer. Sci. Amer., 75-83, 1998.

[0144] 15. Moul J W, Gaddipati J, and Srivastava S: 1994. Molecular biology of CaP. Oncogenes and tumor suppressor genes. Current Clinical Oncology: CaP. (Eds. Dawson, N. A. and Vogelzang, N. J.), Wiley-Liss Publications, 19-46.

[0145] 16. Lalani E-N, Laniado M E and Abel P D: Molecular and cellular biology of prostate cancer. Cancer and Mets. Rev., 16:29-66, 1997.

[0146] 17. Shi X B, Gumerlock P H, deVere White R W: Molecular Biology of CaP. World J. Urol., 14:318-328, 1996.

[0147] 18. Heidenberg H B, Bauer J J, McLeod D G, Moul J W and Srivastava S: The role of p53 tumor suppressor gene in CaP: a possible biomarker? Urology, 48:971-979, 1996.

[0148] 19. Bova G S and Isaacs W B: Review of allelic loss and gain in prostate cancer. World J. Urol., 14:338-346, 1996.

[0149] 20. Isaacs W B and Bova G S: Prostate Cancer: The Genetic Basis of Human Cancer. Eds. Vogelstein B, and Kinzler K W, McGraw-Hill Companies, Inc., pp. 653-660, 1998.

[0150] 21. Srivastava S and Moul J W: Molecular Progression of Prostate Cancer. Advances in Oncobiology. (In Press) 1998.

[0151] 22. Sakr W A, Macoska J A, Benson P, Benson D J, Wolman S R, Pontes J E, and Crissman: Allelic loss in locally metastatic, multi-sampled prostate cancer. Cancer Res., 54:3273-3277, 1994.

[0152] 23. Mirchandani D, Zheng J, Miller G L, Ghosh A K, Shibata D K, Cote R J and Roy-Burman P: Heterogeneity in intratumor distribution of p53 mutations in human prostate cancer. Am. J. Path., 147:92-101, 1995.

[0153] 24. Bauer J J, Moul J W, and McLeod D G: CaP: Diagnosis, treatment, and experience at one tertiary medical center, 1989-1994. Military Medicine, 161:646-653, 1996.

[0154] 25. Bauer J J, Connelly R R, Sesterhenn I A, Bettencourt M C, McLeod D G, Srivastava S, Moul J W: Biostatistical modeling using traditional variables and genetic biomarkers predicting the risk of prostate cancer recurrence after radical prostatectomy. Cancer, 79:952-962, 1997.

[0155] 26. Bauer J J, Connelly R R, Sesterhenn I A, DeAusen J D, McLeod D G, Srivastava S, Moul J W: Biostatistical modeling using traditional preoperative and pathological prognostic variables in the selection of men at high risk of disease recurrence after radical prostatectomy. J. Urol., 159(3):929-933, 1998.

[0156] 27. Sager R: Expression genetics in cancer: Shifting the focus from DNA to RNA. Proc. Natl. Acad. Sci. USA, 94:952-957, 1997.

[0157] 28. Strausberg R L, Dahl C A, and Klausner R D: New opportunities for uncovering the molecular basis of cancer. Nature Genetics, 15:415-16, 1997.

[0158] 29. Liang P and Pardee A B: Differential display of eukaryotic messenger RNA by means of the polymerase chain reaction. Science, 257:967-971, 1992.

[0159] 30. Velculescu V E, Zhang L, Vogelstein B, and Kinzler K W: Serial analysis of gene expression. Science, 270:484-487, 1995.

[0160] 31. Schena M, Shalon D, Davis R W, and Brown P O: Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science, 270:467-470, 1995.

[0161] 32. Liu A Y, Corey E, Vessella R L, Lange P H, True L D, Huang G M, Nelson P S and Hood L: Identification of differentially expressed prostate genes: Increased expression of transcription factor ETS-2 in prostate cancer. The Prostate, 30:145-153, 1997.

[0162] 33. Chuaqui R F, Englert C R, Strup S E, Vocke C D, Zhuang Z, Duray P H, Bostwick D G, Linehan W M, Liotta L A and Emmert-Buck M R: Identification of a novel transcript up-regulated in a clinically aggressive prostate carcinoma. Urology, 50:302-307, 1997.

[0163] 34. Thigpen A E, Cala K M, Guileyardo J M, Molberg K H, McConnell J D, and Russell D W: Increased expression of early growth response-1 messenger ribonucleic acid in prostate adenocarcinoma. J. Urol., 155:975-981, 1996.

[0164] 35. Wang F L, Wang Y, Wong W K, Liu Y, Addivinola F J, Liang P, Chen L B, Kantoff P W and Pardee A B: Two differentially expressed genes in normal human prostate tissues and in carcinoma. Cancer Res., 56:3634-3637, 1996.

[0165] 36. Schleicher R L, Hunter S B, Zhang M, Zheng M, Tan W, Bandea C I, Fallon M T, Bostwick D G, and Varma V A: Neurofilament heavy chain-like messenger RNA and protein are present in benign prostate and down regulated in prostate carcinoma. Cancer Res., 57:3532-3536, 1997.

[0166] 37. O'Keefe D S, Su S L, Bacich D J, Horiguchi Y, Luo Y, Powell C T, Zandvliet D, Russell P J, Molloy P L, Nowak N J, Shows T B, Mullins C, Vonder Haar R A, Fair W R, and Heston W D: Mapping, genomic organization and promoter analysis of the human prostate-specific membrane antigen gene. Biochim. Biophys. Acta, 1443(1-2):113-127, 1998.

[0167] 38. Silver D A, Pellicer I, Fair W R, Heston W D, and Cordon-Cardo C: Prostate-specific membrane antigen expression in normal and malignant human tissues. Clin. Cancer Res., 3(1):81-85, 1997.

[0168] 39. He W W, Sciavolino P J, Wing J, Augustus M, Hudson P, Meissner P S, Curtis R T, Shell B K, Bostwick D G, Tindall D J, Gelmann E P, Abate-Shen C, and Carter K C: A novel prostate-specific, androgen-regulated homeobox gene (NKX3.1) that maps to 8p21, a region frequently deleted in prostate cancer. Genomics, 43(1):69-77, 1997.

[0169] 40. Crespi M D, Jurkevitch E, Poiret M, d'Aubenton-Carafa Y, Petrovics G, Kondorosi E, and Kondorosi A: Enod 40, a gene expressed during nodule organogenesis, codes for a non-translatable RNA involved in plant growth. The EMBO J., 13:5099-5112, 1994.

[0170] 41. Velleca M A, Wallace M C and Merlie J P: A novel synapse-associated non-coding RNA. Mol. Cell. Biol., 14:7095-7104, 1994.

[0171] 42. Takeda K, Ichijo H, Fujii M, Mochida Y, Saitoh M, Nishitoh H, Sampath T K and Miyazono K: Identification of a novel bone morphogenetic protein responsive gene that may function as non-coding RNA. J. Biol. Chem., 273:17079-17085, 1998.

[0172] 43. Van de Sande K, Pawlowski K, Czaja I, Wieneke U, et al: Modification of phytohormone response by a peptide encoded by ENOD 40 of legumes and a non-legume. Science, 273:370-373, 1996.

[0173] 44. Hao Y, Crenshaw T, Moulton T, Newcomb E and Tycko B: Tumor suppressor activity of H19 RNA. Nature, 365:764-767, 1993.

[0174] 45. Neumaier M, Gerhard M, Wagener C: Diagnosis of micrometastases by the amplification of tissue specific genes. Gene, 159(1):43-47, 1995.

[0175] 46. Gaddipati J, McLeod D, Sesterhenn I, Hussussian C, Tong Y, Seth P, Dracopoli N, Moul J and Srivastava S: Mutations of the p16 gene product are rare in prostate cancer. The Prostate, 30:188-194, 1997.

[0176] 47. Davis L D, Sesterhenn I A, Moul J W and Srivastava S: Characterization of prostate cancer cells immortalized with E6/E7 genes. Int. Symp. on Biol. of Prost. Growth Proceedings, National Institutes of Health, 77, 1998.

[0177] 48. Wilkinson, D., & Green, J. (1990) in Post implantation Mammalian Embryos, eds. Copp, A. J. & Cockroft, D. L. (Oxford University Press, London), pp. 155-171.

[0178] 49. Erdmann, V. A., Szymanski, M., Hochberg, A., de Groot, N., & Barciszewski, J. (1999) Nucleic Acids Research 27, 192-195.

[0179] 50. Askew, D. S., & Xu, F. (1999) Histol. Histopathol. 14, 235-241.

[0180] 51. Bussemakers, M. J. H., Van Bokhoven, A., Verhaegh, G. W., Smit, F. P., Karthaus, H. F., Schalken, J. A., Debruyne, F. M., Ru, N., & Isaacs, W. B. (1999) Cancer Res. 59, 5975-5979.

[0181] 52. Lanz, R. B., McKenna, N. J., Onate, S. A., Albrecht, U., Wong, J., Tsai, S. Y., Tsai, M. J., & O'Malley, B. W. (1999) Cell 97, 17-27.

[0182] 53. Srikantan V, Zou Z, Petrovics G, Xu L, Augustus M, Davis L, Livezey J R, Connell T, Sesterhenn I A, Yoshino K, Buzard G S, Mostofi F K, McLeod D G, Moul J W, and Srivastava S: PCGEM1: A Novel Prostate Specific Gene is Overexpressed in Prostate Cancer. Submitted to Proceedings of the National Academy of Sciences.

1 22 1 1603 DNA Homo sapiens 1 aaggcactct ggcacccagt tttggaactgcagttttaaa agtcataaat tgaatgaaaa 60 tgatagcaaa ggtggaggtt tttaaagagctatttatagg tccctggaca gcatcttttt 120 tcaattaggc agcaaccttt ttgccctatgccgtaacctg tgtctgcaac ttcctctaat 180 tgggaaatag ttaagcagat tcatagagctgaatgataaa attgtactac gagatgcact 240 gggactcaac gtgaccttat caagtgagcaggcttggtgc atttgacact tcatgatatc 300 agccaaagtg gaactaaaaa cagctcctggaagaggacta tgacatcatc aggttgggag 360 tctccaggga cagcggaccc tttggaaaaggactagaaag tgtgaaatct attagtcttc 420 gatatgaaat tctctgtctc tgtaaaagcatttcatattt acaagacaca ggcctactcc 480 tagggcagca aaaagtggca acaggcaagcagagggaaaa gagatcatga ggcatttcag 540 agtgcactgt cttttcatat atttctcaatgccgtatgtt tggttttatt ttggccaagc 600 ataacaatct gctcaagaaa aaaaaatctggagaaaacaa aggtgccttt gccaatgtta 660 tgtttctttt tgacaagccc tgagatttctgaggggaatt cacataaatg ggatcaggtc 720 attcatttac gttgtgtgca aatatgatttaaagatacaa cctttgcaga gagcatgctt 780 tcctaagggt aggcacgtgg aggactaagggtaaagcatt cttcaagatc agttaatcaa 840 gaaaggtgct ctttgcattc tgaaatgcccttgttgcaaa tattggttat attgattaaa 900 tttacactta atggaaacaa cctttaacttacagatgaac aaacccacaa aagcaaaaaa 960 tcaaaagccc tacctatgat ttcatattttctgtgtaact ggattaaagg attcctgctt 1020 gcttttgggc ataaatgata atggaatatttccaggtatt gtttaaaatg agggcccatc 1080 tacaaattct tagcaatact ttggataattctaaaattca gctggacatt gtctaattgt 1140 tttttatata catctttgct agaatttcaaattttaagta tgtgaattta gttaattagc 1200 tgtgctgatc aattcaaaaa cattactttcctaaatttta gactatgaag gtcataaatt 1260 caacaaatat atctacacat acaattatagattgtttttc attataatgt cttcatctta 1320 acagaattgt ctttgtgatt gtttttagaaaactgagagt tttaattcat aattacttga 1380 tcaaaaaatt gtgggaacaa tccagcattaattgtatgtg attgttttta tgtacataag 1440 gagtcttaag cttggtgcct tgaagtcttttgtacttagt cccatgttta aaattactac 1500 tttatatcta aagcatttat gtttttcaattcaatttaca tgatgctaat tatggcaatt 1560 ataacaaata ttaaagattt cgaaatagaaaaaaaaaaaa aaa 1603 2 1579 DNA Homo sapiens 2 gcggccgcgt cgacgcaacttcctctaatt gggaaatagt taagcagatt catagagctg 60 aatgataaaa 
ttgtacttcgagatgcactg ggactcaacg tgaccttatc aagtgagatg 120 gagtcttgcc ctgtctccaaggctggagcc caatggtgtg atcttggctc actgcaacct 180 ccacctccca ggttcaaacgtttctcctgc ctcagcctcc caagtaactg ggattacagc 240 aggcttggtg catttgacacttcatgatat cagccaaagt ggaactaaaa acagctcctg 300 gaagaggact atgacatcatcaggttggga gtctccaggg acagcggacc ctttggaaaa 360 ggactagaaa gtgtgaaatctattagtctt cgatatgaaa ttctctgtct ccgtaaaagc 420 atttcatatt tacaagacacaggcctactc ctagggcagc aaaaagtggc aacaggcaag 480 cagagggaaa agagatcatgaggcatttca gagtgcactg tcttttcata tatttctcaa 540 tgccgtatgt ttggttttattttggccaag cataacaatc tgctcaaaaa aaaaaaatct 600 ggagaaaaca aaggtgcctttgccaatgtt atgtttcttt ttgacaagcc ctgagatttc 660 tgaggggaat tcacataaatgggatcaggt cattcattta cgttgtgtgc aaatatgatt 720 taaagataca acctttgcagagagcatgct ttcctaaggg taggcacgtg gaggactaag 780 ggtaaagcat tcttcaagatcagttaatca agaaaggtgc tctttgcatt ctgaaatgcc 840 cttgttgcaa atattggttatattgattaa atttacactt aatggaaaca acctttaact 900 tacagatgaa caaaccccacaaaagcaaaa aatcaaaagc cctacctatg atttcatatt 960 ttctgtgtaa ctggattaaaggattcctgc ttgcttttgg gcataaatga taatggaata 1020 tttccaggta ttgtttaaaatgagggccca tctacaaatt cttagcaata ctttggataa 1080 ttctaaaatt cagctggacattgtctaatt gttttttata tacatctttg ctagaatttc 1140 aaattttaag tatgtgaatttagttaatta gctgtgctga tcaattcaaa aacattactt 1200 tcctaaattt tagactatgaaggtcataaa ttcaacaaat atatctacac atacaattat 1260 agattgtttt tcattataatgtcttcatct taacagaatt gtctttgtga ttgtttttag 1320 aaaactgaga gttttaattcataattactt gatcaaaaaa ttgtgggaac aatccagcat 1380 taattgtatg tgattgtttttatgtacata aggagtctta agcttggtgc cttgaagtct 1440 tttgtactta gtcccatgtttaaaattact actttatatc taaagcattt atgtttttca 1500 attcaattta catgatgctaattatggcaa ttataacaaa tattaaagat ttcgaaatag 1560 aaaaaaaaaa aaaaatcta1579 3 1819 DNA Homo sapiens 3 tccctcttgc gttctgcaat ttctgaaaaaaagatgttta ttgcaaagtg atatgagcac 60 tggaaaggta ctaattccaa tttgattctaattggatgag tgacatgggt aagcgattct 120 aagcatttgt gtttttttta gtagtatggaatttaattag ttctcagtat gttagtgaag 180 atgaatgaaa 
acatgcatat gtttccatgtattataaata ttttaaaatg caaaaaatta 240 ttctaatgaa tatataaata taaagcataacaataataat acaataccac ccataaagtc 300 atcatctaat ttaaaaacta aaacattaacacttgaatct cccccattgc aacatctttc 360 ccgacttgtg tgtttttttc ttttgcttttaaaatttttg ttttatcata tgtctgcata 420 agattatata gctttccttg ttttaagctttttaaataat atattgtagt tatattattt 480 gtgctttgct ttttttactt aacattatggttctaaaatt cagtaatgtg ttgggcatgt 540 ataatttgtt tatttttaat ctctttgacattcgactata taaatttcag tttgtttatt 600 gactcctttg tctatagata ctctgctatttctgtttttg ctgttacaaa aataatgctg 660 ttttaaattt cattttgtat acttttttgaggcatgtgta tgagttattc taaggtaaaa 720 aaataagaaa aaattgctgg gttataagattgtcacatgc tcgaatttac aagataatgc 780 caaatcattt ttcaaagtaa ttatacctatttatactacc ggtatgagta tattggtgcc 840 cacatagttg cttgttctgc caaagtttggtatgatcgaa caataatttt tgcccatcaa 900 atggcataaa ataaaatctc agtgtgcttttaatttgcat tttctatgtt taagaattgt 960 ttctttttta accatttata atttacttttgctgaaatgc ttgcttatta tttttgctcc 1020 ccattttttc ctattggatt gcttttctcattaatttata agaattttat atggtttaga 1080 tactaattat tatattactg aaaatacctttatcagtttg ttgtgtactt tctactttat 1140 gtcttgtgat ggataaaagt tttaaattgtattgtgttga agttaacatt tttaaatttt 1200 ataatcagca tctttaataa tctctttmtaaaattttcct ttacatagat gtcataaaga 1260 tacatctcta taatttctta tttttttggcatatgttcat taagtcattt tatcattttt 1320 tagtaataaa ttgcagttat ttatgaaacaaataattttt aaaattatat atgctttctt 1380 taaaaattga tcttagcatg cttcactatgaagcttgagg cttcactgca cgttgtactg 1440 aaattatgta taaaacagtg gttctgaaaatctctgagtt catgacacct ttagtgtctc 1500 aggttttttt gcttttgttc ttgttttttctcacaaagca cctaagttaa ataaaaacaa 1560 agcacaaagc tatcagcttc atgtattaagtagtaagctc ccatgttaac agttgtaact 1620 tgcctggtgc ccaatagatg tcactctgttttcctagaaa ctttaaaata tccctcagtg 1680 ctcctgttaa ttcatggtag tgccccaaggcactctggca cccagttttg gaactgcagt 1740 tttaaaagtc ataaattgaa tgaaaatgatagcaaaggtg gaggttttta aagagctatt 1800 tataggtccc tggacagca 1819 4 1025DNA Homo sapiens 4 ttttttcaat taggcagcaa cctttttgcc ctatgccgtaacctgtgtct gcaacttcct 60 ctaattggga 
aatagttaag cagattcata gagctgaatgataaaattgt actacgagat 120 gcactgggac tcaacgtgac cttatcaagt gagcaggcttggtgcatttg acacttcatg 180 atatcatcca aagtggaact aaaaacagct cctggaagaggactatgaca tcatcaggtt 240 gggagtctcc agggacagcg gaccctttgg aaaaggactagaaagtgtga aatctattag 300 tcttcgatat gaaattctct gtctctgtaa aagcatttcatatttacaag acacaggcct 360 actcctaggg cagcaaaaag tggcaacagg caagcagagggaaaagagat catgaggcat 420 ttcagagtgc actgtctttt catatatttc tcaatgccgtatgtttggtt ttattttggc 480 caagcataac aatctgctca agaaaaaaaa atctggagaaaacaaaggtg cctttgccaa 540 tgttatgttt ctttttgaca agccctgaga tttctgaggggaattcacat aaatgggatc 600 aggtcattca tttacgttgt gtgcaaatat gatttaaagatacaaccttt gcagagagca 660 tgctttccta agggtaggca cgtggaggac taagggtaaagcattcttca agatcagtta 720 atcaagaaag gtgctctttg cattctgaaa tgcccttgttgcaaatattg gttatattga 780 ttaaatttac acttaatgga aacaaccttt aacttacagatgaacaaacc cacaaaagca 840 aaaaatcaaa agccctacct atgatttcat attttctgtgtaactggatt aaaggattcc 900 tgcttgcttt tgggcataaa tgataatgga atatttccaggtattgttta aaatgagggc 960 ccatctacaa attcttagca atactttgga taattctaaaattcagctgg acattgtcta 1020 attgt 1025 5 21 DNA Artificial SequenceDescription of Artificial SequenceProbe/Primer 5 tgcctcagcc tcccaagtaa c21 6 21 DNA Artificial Sequence Description of ArtificialSequenceProbe/Primer 6 ggccaaaata aaaccaaaca t 21 7 19 DNA ArtificialSequence Description of Artificial SequenceProbe/Primer 7 tggcaacaggcaagcagag 19 8 11801 DNA Homo sapiens unsure (7470) n may represent anyof the four nucleotide bases 8 tccctcttgc gttctgcaat ttctgaaaaaaagatgttta ttgcaaagtg atatgagcac 60 tggaaaggta ctaattccaa tttgattctaattggatgag tgacatgggt aagcgattct 120 aagcatttgt gtttttttta gtagtatggaatttaattag ttctcagtat gttagtgaag 180 atgaatgaaa acatgcatat gtttccatgtattataaata ttttaaaatg caaaaaatta 240 ttctaatgaa tatataaata taaagcataacaataataat acaataccac ccataaagtc 300 atcatctaat ttaaaaacta aaacattaacacttgaatct cccccattgc aacatctttc 360 ccgacttgtg tgtttttttc ttttgcttttaaaatttttg ttttatcata tgtctgcata 420 agattatata 
gctttccttg ttttaagctttttaaataat atattgtagt tatattattt 480 gtgctttgct ttttttactt aacattatggttctaaaatt cagtaatgtg ttgggcatgt 540 ataatttgtt tatttttaat ctctttgacattcgactata taaatttcag tttgtttatt 600 gactcctttg tctatagata ctctgctatttctgtttttg ctgttacaaa aataatgctg 660 ttttaaattt cattttgtat acttttttgaggcatgtgta tgagttattc taaggtaaaa 720 aaataagaaa aaattgctgg gttataagattgtcacatgc tcgaatttac aagataatgc 780 caaatcattt ttcaaagtaa ttatacctatttatactacc ggtatgagta tattggtgcc 840 cacatagttg cttgttctgc caaagtttggtatgatcgaa caataatttt tgcccatcaa 900 atggcataaa ataaaatctc agtgtgcttttaatttgcat tttctatgtt taagaattgt 960 ttctttttta accatttata atttacttttgctgaaatgc ttgcttatta tttttgctcc 1020 ccattttttc ctattggatt gcttttctcattaatttata agaattttat atggtttaga 1080 tactaattat tatattactg aaaatacctttatcagtttg ttgtgtactt tctactttat 1140 gtcttgtgat ggataaaagt tttaaattgtattgtgttga agttaacatt tttaaatttt 1200 ataatcagca tctttaataa tctctttataaaattttcct ttacatagat gtcataaaga 1260 tacatctcta taatttctta tttttttggcatatgttcat taagtcattt tatcattttt 1320 tagtaataaa ttgcagttat ttatgaaacaaataattttt aaaattatat atgctttctt 1380 taaaaattga tcttagcatg cttcactatgaagcttgagg cttcactgca cgttgtactg 1440 aaattatgta taaaacagtg gttctgaaaatctctgagtt catgacacct ttagtgtctc 1500 aggttttttt gcttttgttc ttgttttttctcacaaagca cctaagttaa ataaaaacaa 1560 agcacaaagc tatcagcttc atgtattaagtagtaagctc ccatgttaac agttgtaact 1620 tgcctggtgc ccaatagatg tcactctgttttcctagaaa ctttaaaata tccctcagtg 1680 ctcctgttaa ttcatggtag tgccccaaggcactctggca cccagttttg gaactgcagt 1740 tttaaaagtc ataaattgaa tgaaaatgatagcaaaggtg gaggttttta aagagctatt 1800 tataggtccc tggacagcat cttttttcaattaggcagca acctttttgc ctatgccgta 1860 actgtgtctg cacttcctct aattggggtgagtaagagat tttgttatgt atataatagc 1920 taagaatata gtaataatgg cttaaatcatggttattttt aaactactaa catttagaag 1980 acaaaataaa aatgctttga aaagtatagaggttttagtg taattagcag ggaataatga 2040 aatgatttga tagggctact cagttttgtataactttggt gctttaagtc tgaatgcaga 2100 gcatggatgt tgtgatccag cctttatatgttttccctga agaagattta 
atttatttgg 2160 ccttttgaga aacacatttg gcattgtaatatgttttgct tccaggttct atctccaagg 2220 ataatttgac aaaatcacac ataaatttattttcagggca cacagtttcc cttttaggga 2280 actcacagag gtagagagta atacaataatcacatttgaa tattcagtaa gtgaggtcct 2340 catagatctt atgtgtatgt caccatgtatataattttgt taatcactag atgtatgaga 2400 caagaaattt gaggaatctt aactagagattaaaatcagg gatttaaatc aaagaaacat 2460 ttaaatgcct cctttattat ttaaatacctgcatgggaga atcattgaaa aaaaaataaa 2520 aagcatacaa cttgggaata ttataaaccaagaagaattt gttattctgg ttgatttttt 2580 tttcaggctc cgcacaggca acttacctttatctctttgt gatttttatt tcttgttaaa 2640 atatacagaa atagttaagc agattcatagagctgaatat aaaatttact acgagatgca 2700 ctgggactca acgtgacctt atcaagtgacttatcagtga ggtgagcatt cttaattcag 2760 ataatggaac ttattatcat aatcttttgcttatgctatt gttgagctta actacttatt 2820 catatttgca tatgcatatt gagataatatcatttcatta atttcagtac tgaacactaa 2880 tctcctaaga gtaattgtga aagtttcagattgcactatt tttaactata tatctgtatg 2940 ttatcttcat atatgcttga ataacttataagcaattgaa actttcaatt acagtatact 3000 attgaagcaa atcaactaat atatacacatatccattagc aatagtagat aatttttgta 3060 aatgtccagc acagttcttc atatgtagaggatgttcaaa ttggctaagt tccttttctc 3120 tcttaattat tagtattttt cctactgctctttgtataat tattccttcc tctttagctc 3180 caatccttac aatctattct taacatagcaactgggaaga aagtttttaa acataaacca 3240 gatgatgtca ctccacccca caaaacttccactattctct gtcacacata gaaagaaaga 3300 aaaaaaatat tgaaaaccta caaagacttgctatgatctg gtccaggctc tccctaaaat 3360 ttcatgtaat ttccagccac taggcctttctggctctcct tcaatctcat tagccttttc 3420 actactacaa gttagactgg gttttggccgaggtatttct ttttttcata ttttgccttt 3480 gcctagattg ctcttccaat agatattcacaattgcatca tcatttctat atacgtgcta 3540 aaaggtttcc ttgtccaaaa tagcttcagtgaccacctga tctagaatag tctcgatcaa 3600 aagtttcttt tccttttcct caccacttgatatttatatc aaacatttat ttgtgtaatt 3660 tatgtgtttg tttgttttct gtactagcattatgatgacc atactatttg atgcccccca 3720 aaaaatactt tcgagaatga cagggcaaagctaaaataat taaattatat aattttgaca 3780 taggcactat tgacaaaaag caattgatgttatgatagtg ttagatctat gaaatagtac 3840 tatttaaaag taattctctg 
aaatacaattttctaaaact aaaagcagca tatgtacatg 3900 aaacaccaaa aaacttcctt atatttatcactggaagatt taaaatagta taagtagtaa 3960 cttatttaat atatttttga ttatttaattaattttatag tatccaactc taatataatg 4020 ccagtggtat ttgttcaaaa tattttaatgttgtctattt atttttaatt tgcctaaaaa 4080 ttatcttaaa tgaaaatttt tggttaataaatttgaaaat actgaaaccc tcatctccag 4140 tctctgtgga tcctaaagtt tttagttgagaaaataattt ttctctagag aatgaagtag 4200 cttgtaagct tggagaaatt tctgctaaataaatgatatt atcaactctt attttcttca 4260 atacgaaata tataaatatt tcagctcatatatttttgca ggtgctatgc ttttgcttcc 4320 aatcataatt tctgacaaat attttggaagtcaaaacttg tcttctattt tgttatttaa 4380 aattatatag actacttttg taaacctttatactatcaaa tcataggcaa tttcagtttg 4440 atttcattct ggtgcagaat ataagtttatccaagtaaaa caggagtcac ttcaaaagat 4500 tcctcccact gactgagata ttccaaagccaactttgcaa aatttcagaa ttaaatatta 4560 tacttctttg taccttcatt ttatttgttcaatttttctt tgtgtttgta gaaaatttta 4620 atatttttct gttttcaagt tttgattttaatttactact ttataatttt taaaggtaag 4680 ttttgtgagg ctatattcat tatgtgttttgaataaagac atacaattaa ttttgagaac 4740 tgcaataaaa attataagac tattaaaaatgcagtaagtg tactacactt aggctgctaa 4800 aaatgcagta ccagtagact acatttaggctgcttaaagt tagttcttct aagtaccata 4860 tactttaaaa ttttagctaa tgatggagaacaaagacaga aagactgtgt taccatattc 4920 tagttggcca ttttgttttg ttttgagagacgtcacatca gccttatcat aaaaattatt 4980 tggttttacc attttgactg tgagcaaaatatacagcata atatacaaaa taaaatatat 5040 gtacatcttc acaacttctt gtttaggatgcaattatata tatatatata tatatattta 5100 ttattatact ttaagttcta gggtacatggcaccacgtgc aggttgttac atatgtatac 5160 atgtgccatg ttggtgtgct gcacccattaactcgtcatt tacattaggt gtatctccta 5220 atgctatccc tcccctctct ccccaccccacaacaagccc cggtgtgtga tgttcccctt 5280 cctgtgtcca tgtgttctca ttgttcaattcccacctatg agtgagaaca cgcagtgttt 5340 gcttttttgt ccttgcaata gtttgctgagaatgatggtt tccagcttca tccatgtccc 5400 tacaaaggac atgaactcat cattttttatggctgcatag tattccatgg tgtatatgtg 5460 ccaccatttt cttaatccga gtctgtccattgttgttgga catttgggtt gcaattttga 5520 gtttcatgtg tagcatgtat agcacaaccaattaagattt ctttctttct 
ctcttttttt 5580 tttttttttg ttgaaatgga gtcttgcctgtctccaaggc tggagcccaa tggtgtgatc 5640 ttggcttact gcaacctcca cctcccgggttcaagcgatt ctcctgcctc agccatccga 5700 gtagctggga ctataggcgt gcaccaccatgcccagctaa tttttgtatt tttagtacag 5760 acggggtttc accacggtgg ccaggatggtctcaatttct tgacctcatg attcacccgc 5820 cttggcctcc caaagtgctg ggattacaggtgtgaaccac caagcccggc ctgtcacaag 5880 tttttagtgt tctattttaa tacagaaattagataaatcc aaagagaaag acatttcata 5940 tgtgcgtaga gttgtcggaa gaaatgagagtcttataaat aactttaaaa attgtgaaga 6000 aataaaggca aaatagtcct atgcagtttgatttaaatat attcttaata agagctactt 6060 ttgtgaaaac cagaatattg aaacatgtagatatggatct tcattagtga ctgacataat 6120 atattgttat tgttactatt ttattgtatcagccaactaa tattgagtgc tttgtgtatc 6180 ctaagcacta tgctaaacac tgtaccagtattacctgata taatcatatt aatatttatt 6240 atttcacttt tcatatgaaa aaattgaagcacagattaag acactccgaa atcatacctc 6300 tattgattat cagcaccagg atttgaattgaggcactctg atccagagaa gcttttgttt 6360 ccatgaaggc ttatgttggg gaaaaataatcaaattgcct gtacctcagt tgtataaata 6420 agaggttggg ttggtagatg attctggctgattcagcaga aaagaaattt attcaaagga 6480 tatcacacag ttttcataac agttaagaatacagaggaaa cagggcacca gggctaagta 6540 cagaccaaag tccaaaacca ctgccaaagttgcagcaagg agaacagcac aaatttgctt 6600 gctgtcaccc gccactagat gcttttgtttggagccttga acttgactta cactgccact 6660 gacatcagca ccagtgctct ctgtgtactaggaggtggag ttggtgacgt tgctgaactc 6720 aaagcagatg tttctgctgt gaaatagatacctaatacag aacctgcttc ctcattcatt 6780 ccctccccaa atcatatgct tgtagtgtggctagagtttc tgtttctcct tggtccaggc 6840 agaatttatg aagcttgcta tttatcgccttaaagattag aagaatattc ataaggtatt 6900 agattgccat aaggttgaac aaatcaacattcaacttcaa ggattcaaca ttgttttgtt 6960 ttcttttggg atacctctgc agcagttcaaatcttatttc tgcccttgga caaccaggtt 7020 tataaatatt gcagattctc cactgactgctttgatccta tcttctatat ttatgtatac 7080 taattagcat ataataaaag attatgttacagaatctcaa aattagtaat tatgaattga 7140 gatggtgtta tacagtacac taacatccaagagacttgtt tattccaagg aaaatattta 7200 gagatattaa atgatatttc tcatcctttagacatataca ttttttagct tacagcctgc 7260 tttaggcaag caacagactc 
tcaggatctgctcctaccag ggtctgaaca tttcctccca 7320 gttttaaaga aacaaattca aataacattgtaacctccag aggaaagttc aagctctttt 7380 atagtattgt ttaaacagta cagctgaggaaactaaagac agagaagtta aatgccttgg 7440 cacttagtct agatttacaa taaactcctntctacttagg acccactaac aggggctgca 7500 tttacaccaa aaccatgaag gtggcccaagtcatcactga gaagtagtac aagcaccgag 7560 ggaatgactt caacaggaac aagaaagcgtggaaggagat cctagcagga agctccacaa 7620 gaagatagca tgttacgtct tgcattggatgaagcaggtt cagagagacc tagtgacagc 7680 tatctccgtc aaggtgcaga aggagagatcattgaatgta gcattttcat gcaaaaaaaa 7740 aaatgttgaa gtctttggac ttcgggagtctgtccaaact gcaggtcact cagcctacag 7800 ttgggatgaa tttcaaaaca ccagttggagccggttgaat ctttctgcta tgctgtaata 7860 ttttcagtaa acccagcgca acaacaacaacaaaacacaa aaggaggaga agcagccaag 7920 tctcttggtt tacagagtag ctcctaataccccttgctgt ctgtctcaag tgcccaatgg 7980 gaagatagtc aaaacaatat tcacacctgtgattcatctc tctacatgca gtgtgtgtga 8040 atctttatat actgcatatt aaggatctgtctttacagat aaaaactaaa gcattgaagg 8100 aactccttgt tttgacttat caaagtccttaagaaaatac tagaaaatta tagccattgt 8160 ttcaaatttt agctttatat tatcacttgaaatgtgatga aatgtggctg atagataata 8220 attcactgat aacctacaga caattcccatcttaaaatgg accattggat tgaagaatta 8280 aataaaattg agggttttcc ttacatgttttgtctaaaga gcgaagtaga aacaactgtt 8340 catagatctt cattgaggat tcgcatgtgaagtaagtact cctaacataa acaagtggac 8400 ttatcaacca agttccataa atcatgaacaaaaatatttg tccccagaga gactattttt 8460 ccaccacatc tcttgtaata aacacagagcccagttcagt taaaatagtt taagggtgga 8520 cggttcaggg cctgctgagt ggcactcagtaagaaaaccc agcagaacat ttacttctct 8580 ctttattcca gagcatcaat ggccaaggctggaagatccc agaacactga acagacattt 8640 ggtctcttat ggcctgccaa ttttcacagtgggttccaac gctttgggtc aaaccaaaat 8700 agacctgtta gaaaaatgtc ggttggaatacgctaacaat aagacagaat aaatgtgatt 8760 atttcacctc atttttatag gacttgagtaattttattat aacattcttg agggctggaa 8820 aatctgaatg ttaggacacc aaatatctccagaaaacaag ttttatattt ctaatcctgc 8880 ataataaacc tggggccact gcaggcctcattaataaaaa cctaatggta taacaataat 8940 gaggaggaaa tgccaatgcc gcacaaatctgttgagacta aaatatttct 
caccccagca 9000 ggcttggtgc atttgacact tcatgatatcagccaaagtg gaactaaaaa cagctcctgg 9060 aagaggacta tgacatcatc aggttgggagtctccaggga cagcggaccc tttggaaaag 9120 gactagaaag tgtgaaatct attagtcttcgatatgaaat tctctgtctc tgtcaaaagc 9180 atttcatatt tacaagacac aggcctactcctagggcagc aaaaagtggc aacaggcaag 9240 cagagggaaa agagatcatg aggcatttcagagtgcactg tcttttcata tatttctcaa 9300 tgccgtatgt ttggttttat tttggccaagcataacaatc tgctcaagaa aaaaaaatct 9360 ggagaaaaca aaggtgcctt tgccaatgttatgtttcttt ttgacaagcc ctgagatttc 9420 tgaggggaat tcacataaat gggatcaggtcattcattta cgttgtgtgc aaatatgatt 9480 taaagataca acctttgcag agagcatgctttcctaaggg taggcacgtg gaggactaag 9540 ggtaaagcat tcttcaagaa tcagttaatcaaagaaaggt gctctttgca ttctgaaatg 9600 cccttgttgc aaatattggt tatattgattaaatttacac ttaatggaaa caacctttaa 9660 cttacagatg aacaaaccca caaaagcaaaaaatcaaaag ccctacctat gatttcatat 9720 tttctgtgta actggattaa aggattcctgcttgcttttg ggcataaatg ataatggaat 9780 atttccaggt attgtttaaa atgagggcccatctacaaat tcttagcaat actttggata 9840 attctaaaat tcagctggac attgtctaattgttttttat atacatcttt gctagaattt 9900 caaattttaa gtatgtgaat ttagttaattagctgtgctg atcaattcaa aaacattact 9960 ttcctaaatt ttagactatg aaggtcataaattcaacaaa tatatctaca catacaatta 10020 tagattgttt ttcattataa tgtcttcatcttaacagaat tgtctttgtg attgttttta 10080 gaaaactgag agttttaatt cataattacgttgatcaaaa aattgtggga acaatccagc 10140 attaattgta tgtgattgtt tttatgtacataaggagtct taagcttggt gccttgaagt 10200 cttttgtact tagtcccatg tttaaaattactactttata tctaaagcat ttatgttttt 10260 caattcaatt tacatgatgc taattatggcaattataaca aatattaaag atttcgaaat 10320 agaatatgtg aattgttcac catacatagaaatgaaaagt tcatttcgta aagcaagatg 10380 ctgggtgaaa gagtgctttt gattgaaagatcactagatt agtagagggc aagactttta 10440 gtccctaatc tacccttaat agccatgtggtcacgtgtaa gtcagtgaac ccatctcatt 10500 ctcctcatac ttttttcatc tctaaaatgagggtataatt taagctcgtt catttttttt 10560 tttttttgag atagagtttt gctcttgtcacccaggttgg agtgcaatgg cacgatctca 10620 gctcactgca accctctgct tcctcggttcaagtgattct ccctgcttca gcctcccaag 10680 tgagcccggg 
attacaggtg cccgccaccacatctgggcc tagatttttt gtattttcac 10740 catgttggcc aggctggtct cgaacccctacctcaggtga tccctcgcct cggcctctca 10800 aagtgctggg attacaggtg tgagccaccacgcccagccc aatatcagtt tttctttttt 10860 aacacaaggc taacacaatc aaaatactagctaggggaga aaaaaaaaat aaggcactgt 10920 ttatgtgtaa caggctcttg ttgcaatccactggggcaga ccaaataaac agtaagaatc 10980 aaatcctttt catataatcc tttctttgcagaatacataa aatccccaca aatggcttat 11040 cttccttttt atgatatgtt ggagaattgtagctaagtga cagatatttt gcttgggtgt 11100 atagaccaca aaggactgtg tcttgatgatggtttgcata aaattatacc ttagttttta 11160 ctttgtatgt tacatgttag atttagagtatgaaaattag tagggaggat tattaacaaa 11220 gaacagggca agaggagtag aattaaacctcttctaatac ctgtgcacaa gtaggctttt 11280 cagaaactct acaaccccaa cataaactggatagttagaa aagcacactc ccaaggaagg 11340 cggttatgtt ttgcagtttg aatcagaagaatagagctat agcaatcttc attctatagt 11400 aacattaaag agcctggttt atattatagcagtcattaag atttaaaaat ttacatcttg 11460 ccgttcttct tactcacaga ttttcgagaggtaatgtaat gatcacacga ggtgagaatc 11520 actgcctttt ataatgcgat taaatgcatgaacaaagttt ccaacaaata acagtaataa 11580 aaagaaacat gtattagcac ttaataagccaggtgctgta cgacgtgtgt tacatgcttt 11640 caatccatga actggtaaac tggtactagtatctctattg gacatgtgag gaaaccaaat 11700 ggagttgata aacagtagag ttaaaaattactcttcatat attatattgc ctcaatctca 11760 cagacatctc tgctaccaaa agctatcatatctagactcg a 11801
9 19 DNA Artificial Sequence Description of Artificial Sequence Probe/Primer 9 tggcaacagg caagcagag 19
10 21 DNA Artificial Sequence Description of Artificial Sequence Probe/Primer 10 ggccaaaata aaaccaaaca t 21
11 24 DNA Artificial Sequence Description of Artificial Sequence Probe/Primer 11 gcaaatatga tttaaagata caac 24
12 25 DNA Artificial Sequence Description of Artificial Sequence Probe/Primer 12 ggttgtatct ttaaatcata tttgc 25
13 27 DNA Artificial Sequence Description of Artificial Sequence Probe/Primer 13 actgtctttt catatatttc tcaatgc 27
14 24 DNA Artificial Sequence Description of Artificial Sequence Probe/Primer 14 aagtagtaat tttaaacatg ggac 24
15 21 DNA Artificial Sequence Description of Artificial Sequence Probe/Primer 15 tttttcaatt aggcagcaac c 21
16 25 DNA Artificial Sequence Description of Artificial Sequence Probe/Primer 16 gaattgtctt tgtgattgtt tttag 25
17 26 DNA Artificial Sequence Description of Artificial Sequence Probe/Primer 17 caattcacaa agacaattca gttaag 26
18 23 DNA Artificial Sequence Description of Artificial Sequence Probe/Primer 18 acaattagac aatgtccagc tga 23
19 24 DNA Artificial Sequence Description of Artificial Sequence Probe/Primer 19 ctttggctga tatcatgaag tgtc 24
20 23 DNA Artificial Sequence Description of Artificial Sequence Probe/Primer 20 aaccttttgc cctatgccgt aac 23
21 22 DNA Artificial Sequence Description of Artificial Sequence Probe/Primer 21 gagactccca acctgatgat gt 22
22 20 DNA Artificial Sequence Description of Artificial Sequence Probe/Primer 22 ggtcacgttg agtcccagtg 20
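
The probe/primer entries in the listing can be checked against the long genomic sequence that precedes them. The following is a minimal Python sketch, not part of the disclosed methods, that locates two of the listed primers on a 180-nucleotide excerpt of that sequence (positions 9181 to 9360 as numbered in the listing) and reports the span between them. The function names, the choice of excerpt, and the pairing of primers 9 and 10 are assumptions made for illustration only; this section does not state that these two primers are used together.

```python
# Excerpt of the 11801-nt listing sequence, positions 9181-9360 as numbered
# in the listing (spaces are only visual separators and are stripped).
EXCERPT_START = 9181
EXCERPT = (
    "atttcatatt tacaagacac aggcctactc ctagggcagc aaaaagtggc aacaggcaag"
    "cagagggaaa agagatcatg aggcatttca gagtgcactg tcttttcata tatttctcaa"
    "tgccgtatgt ttggttttat tttggccaag cataacaatc tgctcaagaa aaaaaaatct"
).replace(" ", "")

# Primer sequences copied from listing entries 9 and 10.
PRIMER_9 = "tggcaacaggcaagcagag"
PRIMER_10 = "ggccaaaataaaaccaaacat"

_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def locate(primer: str, template: str, offset: int):
    """Find a primer on either strand of the template.

    Returns (start, end, strand) with 1-based inclusive coordinates in the
    listing's numbering, or None if the primer does not match exactly.
    """
    for strand, query in (("+", primer), ("-", reverse_complement(primer))):
        idx = template.find(query)
        if idx != -1:
            return (offset + idx, offset + idx + len(query) - 1, strand)
    return None


if __name__ == "__main__":
    fwd = locate(PRIMER_9, EXCERPT, EXCERPT_START)
    rev = locate(PRIMER_10, EXCERPT, EXCERPT_START)
    print("primer 9:", fwd)
    print("primer 10:", rev)
    if fwd and rev:
        # Span from the first base of the plus-strand match to the last base
        # of the minus-strand match (a hypothetical amplicon).
        print("span length:", rev[1] - fwd[0] + 1)
```

Exact string matching is used here for simplicity; it tolerates no mismatches and would need to be replaced by an alignment step for degenerate or mismatched primers.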

We claim:
 1. An isolated nucleic acid molecule selected from: (a) the polynucleotide sequence of SEQ ID NO:1, SEQ ID NO:2, or SEQ ID NO:8; (b) an isolated nucleic acid molecule that hybridizes to either strand of a denatured, double-stranded DNA comprising the nucleic acid sequence of (a) under conditions of moderate stringency in about 50% formamide and about 6×SSC at about 42° C. with washing conditions of approximately 60° C., about 0.5×SSC, and about 0.1% SDS; (c) an isolated nucleic acid molecule that hybridizes to either strand of a denatured, double-stranded DNA comprising the nucleic acid sequence of (a) under conditions of high stringency in about 50% formamide and about 6×SSC, with washing conditions of approximately 68° C., about 0.2×SSC, and about 0.1% SDS; (d) an isolated nucleic acid molecule derived by in vitro mutagenesis from SEQ ID NO:1, SEQ ID NO:2, or SEQ ID NO:8; (e) an isolated nucleic acid molecule degenerate from SEQ ID NO:1, SEQ ID NO:2, or SEQ ID NO:8, as a result of the genetic code; and (f) an isolated nucleic acid molecule selected from the group consisting of human PCGEM1 DNA, an allelic variant of human PCGEM1 DNA, and a species homolog of PCGEM1 DNA.
 2. A recombinant vector that directs the expression of the nucleic acid molecule of claim 1.
 3. A host cell transfected or transduced with the vector of claim 2.
 4. The host cell of claim 3 selected from bacterial cells, yeast cells, and animal cells.
 5. An isolated nucleic acid molecule comprising the polynucleotide sequence selected from SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, and SEQ ID NO: 22.
 6. A method of detecting prostate cancer in a patient, the method comprising: (a) detecting PCGEM1 mRNA in a biological sample from the patient; and (b) correlating the amount of PCGEM1 mRNA in the sample with the presence of prostate cancer in the patient.
 7. The method according to claim 6, wherein step (a) includes: (a) isolating RNA from the sample; (b) amplifying a PCGEM1 cDNA molecule; (c) incubating the PCGEM1 cDNA with the nucleic acid according to claim 1 or 5; and (d) detecting hybridization between the PCGEM1 cDNA and the nucleic acid.
 8. The method according to claim 7, wherein the PCGEM1 cDNA is amplified with at least two nucleotide sequences selected from SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, and SEQ ID NO: 22.
 9. The method according to claim 8, wherein the at least two nucleotide sequences are SEQ ID NO:15 and SEQ ID NO:22.
 10. A method according to claim 6, wherein the biological sample is selected from blood, urine, and prostate tissue.
 11. The method according to claim 10, wherein the biological sample is blood.
 12. A vector, comprising a PCGEM1 promoter sequence operatively linked to a nucleotide sequence encoding a cytotoxic protein.
 13. The vector of claim 12, wherein the PCGEM1 promoter sequence is a nucleic acid molecule comprising the polynucleotide sequence of SEQ ID NO:3.
 14. A method of selectively killing a prostate cancer cell, the method comprising: (a) introducing the vector according to claim 12 to the prostate cancer cell under conditions sufficient to permit selective cell killing.
 15. The method according to claim 14, wherein the cytotoxic protein is selected from ricin, abrin, diphtheria toxin, p53, thymidine kinase, tumor necrosis factor, cholera toxin, Pseudomonas aeruginosa exotoxin A, ribosomal inactivating proteins, and mycotoxins.
 16. A method of identifying an androgen-responsive cell line, the method comprising: (a) obtaining a cell line suspected of being androgen responsive; (b) incubating the cell line with an androgen; and (c) detecting PCGEM1 mRNA in the cell line, wherein an increase in PCGEM1 mRNA, as compared to an untreated cell line, correlates with the cell line being androgen responsive.
 17. A method of measuring the responsiveness of a prostate tissue to hormone-ablation therapy, the method comprising: (a) treating the prostate tissue with hormone ablation therapy; and (b) measuring PCGEM1 mRNA in the prostate tissue following hormone ablation therapy, wherein a decrease in PCGEM1 mRNA, as compared to an untreated cell line, correlates with the prostate tissue responding to hormone ablation therapy.